(57) Abstract

The present invention relates to chimeric transcriptional activators. The chimeric
transcription activator protein contains at least one composite transcription activation
domain (TAD) and at least one DBD. The composite TAD comprises a continuous
polypeptide region containing two or more component polypeptide regions (also referred
to as "activation tags"), at least two of which do not occur together in the same gene
product in nature. In certain of these embodiments the chimeric protein comprises a
ligand binding domain for allosteric activation, e.g. a receptor domain capable of binding
a cell permeant ligand such as tetracycline or an analog thereof or a steroid such as RU486
or ecdysone, for example. In such cases, the chimeric transcription activator selectively
activates transcription of a target gene in the presence of the respective ligand. In other
embodiments the chimeric transcription activator is capable of activating target gene
expression constitutively, i.e., without the need for the presence of any such ligand.
TRANSCRIPTIONAL ACTIVATORS AND COMPOSITIONS AND USES RELATED THERETO



Background of the Invention 

A large number of biological and clinical protocols, among others, gene therapy,
production of biological materials, and biological research, depend on the ability to elicit
specific and high-level expression of genes encoding RNAs or proteins of therapeutic,
commercial, or experimental value. A variety of expression systems have been developed,
including regulated expression systems involving allosteric on-switches triggered by
tetracycline, RU486 and ecdysone, as well as dimerization-based on-off switches triggered
by FK1012, FK-CsA, rapamycin and analogs thereof. See e.g. Clackson, "Controlling
mammalian gene expression with small molecules", Current Opinion in Chemical Biology
1997, 1:210-218. Still, achieving a sufficiently high level of expression for clinical or
other utility in genetically engineered cells in various contexts, including within whole
organisms, has often been a limiting problem. Various approaches for addressing this
problem, including the search for stronger transcriptional promoters or higher transfection
efficiencies, have in many cases not met with success. Meanwhile, in various lines of
research with transcription factors, promising results in transient transfection models have
not been borne out with chromosomally integrated reporter gene constructs. Furthermore,
overexpression of transcription factors is commonly associated with toxicity to the host
cell. Despite those precedents, this invention takes a novel approach to the challenge of
optimizing gene expression through new uses of, and new designs for, transcription factor
proteins which are expressed within the engineered cells containing the target gene. The
invention provides improved methods and materials for achieving high-level expression of
a target gene in genetically engineered cells, including genetically engineered cells within
whole organisms.

Summary of the Invention 

In the course of our research on transcription activation, we have quite
unexpectedly discovered an important phenomenon which may explain the absence of
greater progress in the discovery and/or optimization of more powerful transcription
activating domains. Briefly, we have made the unexpected discovery that in conventional
transient transfection experiments of the sort typically used in this field to study
transcription activation, as the number or apparent potency of activation domains
increases, the observed level of transcription increases to a maximum and then levels off
or decreases in a phenomenon termed "squelching". However, when analogous
experiments are conducted using stably incorporated reporter genes in place of transiently
transfected reporter genes, squelching is avoided, and instead, an unimpeded
structure-activity relationship can be observed for various transcription activation domains
and combinations thereof. With the blinders otherwise imposed by squelching now removed,
we have made significant advances in the design and implementation of transcription
activation domains in chimeric proteins and transcription systems.

This invention encompasses nucleic acid constructs encoding various chimeric
transcription activator proteins (also referred to below as "composite activators") and, in
some embodiments, auxiliary chimeric proteins, as well as related materials and methods
disclosed in greater detail below. A number of illustrative embodiments are highlighted
below.

For example, a number of embodiments of the invention involve a nucleic acid
encoding a chimeric transcription activator protein which activates the transcription of a
target gene to which the chimeric transcription activator protein is targeted. Targeting may
be achieved via one or more DNA-binding domains ("DBDs") which bind to a DNA sequence
to which a target gene is operatively linked. The DBD(s) may be present within the
chimeric transcription activator protein or may be provided by an auxiliary chimeric
protein as disclosed in extensive detail below.

In certain embodiments of the invention the chimeric transcription activator protein
contains at least one composite transcription activation domain ("TAD") and at least one
DBD. The composite TAD comprises a continuous polypeptide region containing two or
more component polypeptide regions (also referred to below as "activation tags"), at least
two of which do not occur together in the same gene product in nature. In certain
embodiments the chimeric protein comprises a ligand binding domain for allosteric
activation, e.g. a receptor domain capable of binding a cell permeant ligand such as
tetracycline or an analog thereof or a steroid such as RU486 or ecdysone, for example. In
such cases, the chimeric transcription activator selectively activates transcription of a
target gene in the presence of the respective ligand. In other embodiments the chimeric
transcription activator is capable of activating target gene expression constitutively, i.e.,
without the need for the presence of any such ligand.
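
The defining test for a composite TAD in the preceding paragraph can be illustrated with a
small data model. The following Python sketch is only an illustration: the class and
function names are hypothetical, and it simplifies "do not occur together in the same gene
product in nature" to "are taken from different source proteins".

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ActivationTag:
    """One component polypeptide region of a TAD (hypothetical model)."""
    name: str            # label for the tag, e.g. "VP16 activation domain"
    source_protein: str  # natural gene product the tag is taken from


def is_composite_tad(tags: Sequence[ActivationTag]) -> bool:
    """Return True if at least two tags come from different source proteins.

    Simplifying assumption: tags taken from different source proteins are
    treated as not occurring together in the same gene product in nature.
    """
    sources = {tag.source_protein for tag in tags}
    return len(tags) >= 2 and len(sources) >= 2


# Illustrative tags only; no sequences are modeled here.
tags = [
    ActivationTag("p65 activation domain", "p65"),
    ActivationTag("VP16 activation domain", "VP16"),
]
print(is_composite_tad(tags))
```

A list that repeats a single tag from one source protein would not pass this simplified
check, although repeated tags are separately contemplated elsewhere in this summary.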

In other embodiments, the chimeric transcription activator protein contains at least
one composite TAD comprising three or more component polypeptide regions, at least two
of which do not occur together in the same gene product in nature, and at least one
additional domain which is heterologous with respect to at least one of the component
polypeptide regions of the composite TAD. The additional domain(s) may include a DBD
or a receptor domain for an oligomerizing ligand. Oligomerizing ligands are multivalent,
preferably cell permeant, compounds, generally having a molecular weight below about 5
kD and preferably below about 2 kD, which mediate the formation of complexes with
proteins containing receptor domains to which the ligands bind. Non-limiting, illustrative
examples of oligomerizing ligands include FK1012 (with respect to proteins containing
FKBP domains), coumermycin (with respect to proteins containing DNA gyrase
domains), fujisporin (with respect to proteins containing cyclophilin and FKBP domains,
respectively) and rapamycin (with respect to proteins containing FKBP and FRAP
domains, respectively).

In still other embodiments, the chimeric transcription activator protein contains at
least one composite TAD comprising a continuous polypeptide region containing two or
more component polypeptide regions, at least two of which do not occur together in the
same gene product in nature, or at least not in the same order, arrangement or number as
found in the chimeric protein of this invention, and at least one additional domain which is
heterologous with respect to at least one of the component polypeptide regions of the
composite TAD. In some of these embodiments, the components are all of human origin.
In other of these embodiments, at least one of the heterologous domains comprises a
domain derived from an immunophilin, cyclophilin, calcineurin, FRAP or DNA gyrase
domain; a domain derived from a receptor for tetracycline or ecdysone or another steroid;
or a composite DNA binding domain.

In still other embodiments, the chimeric transcription activator protein contains at
least one transcription activation domain, which may be a previously known transcription
activation domain such as VP16, a novel truncated p65-derived activation domain
disclosed in detail below or a composite TAD as described herein, and at least one
bundling domain. A bundling domain is a domain permitting assembly of complexes of
two or more proteins each of which comprises a copy of the bundling domain or a close
variant thereof. Non-limiting examples of bundling domains include domains derived from
p53 or the E. coli lac repressor as well as various leucine zipper domains. Bundling
domains are distinguished from heterooligomerizing proteins such as the
FKBP/calcineurin pair, the FKBP/FRAP pair, the cyclophilin/calcineurin pair, the
RXR/TBP pair, for example, which form heterooligomers of proteins containing very
different binding domains. In contrast, bundling domains permit oligomerization between
proteins containing the same bundling domains or very similar variants thereof, and do so
without the need for ligand-mediation.

In various embodiments, one or more of the DBD, receptor domain, oligomerizing
ligand binding domain or other additional heterologous domain is heterologous with
respect to at least one of the component polypeptide regions of the composite transcription
activation domain. In other embodiments, one or more of such additional heterologous
domains is heterologous with respect to at least two of the component polypeptide regions
of the composite transcription activation domain. In one embodiment, the composite
activator protein includes at least two different activation tags from heterologous sources,
e.g., activation tags which are derived from two or more different transcription factors or
co-activators and which do not naturally occur together in the same protein. While not
wishing to be bound by any particular mechanistic hypothesis, two or more of the
activation tags of a composite activator, more preferably two which are derived from
disparate proteins, may recruit TAFs which are not required for basal activation of a gene.
Through the activation tags, the chimeric activator may recapitulate the interactions of
several otherwise discrete factors with the polymerase complex, or provide novel contacts
with that complex, which interactions may recruit and/or stabilize the formation of
activator-dependent transcriptional complexes.

In this context, it has also been discovered that a variety of activation tags can be
repeated multiple times in the same chimeric protein, or multiply associated therewith,
with a concomitant increase in transcriptional activation. The repetitive activation tag
embodiment can be combined with other activation tags, e.g., from the same or
heterologous sources as the repeated activation tag(s).

Another aspect of the invention relates to the identification of novel activation tags.
In this regard, an activation tag has been identified in the NF-kB transcription factor
subunit p65. The so-called "alanine/proline rich" or "AP" activation tag of p65 extends
from about amino acid 361 to about amino acid 450 of that protein. See SEQ ID No. 2.
Similar AP activation tags are also present in, e.g., the p53 and CTF proteins. The present
invention also contemplates fragments of p65 (or homologous sequences thereto, e.g.
from p53 or CTF) which are about 75, 60, 50, 30 or even 20 amino acid residues in length.
In other embodiments, the AP activation tag has an amino acid sequence at least 93%,
90%, 80% or 70% identical to the AP activation tag of SEQ ID No. 2.
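
A percent-identity criterion of this kind can be made concrete with a short calculation.
The Python sketch below is a minimal illustration under stated assumptions: the two
sequences are already aligned to equal length, gap positions ("-") count as mismatches, and
identity is reported over the full aligned length. Other conventions exist, the text does not
specify one, and the example sequences are made-up placeholders rather than SEQ ID No. 2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions carrying the same residue.

    Assumes both sequences are pre-aligned to equal length; a gap
    character "-" in either sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float) -> bool:
    """True if the two sequences are at least threshold_percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Hypothetical ten-residue placeholders differing at the last position.
reference = "APAPLPAPAP"
candidate = "APAPLPAPAA"
print(percent_identity(reference, candidate))                # 90.0
print(meets_identity_threshold(reference, candidate, 70.0))  # True
```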

The subject chimeric activators can be used to drive high levels of transcription
from naturally-occurring, or otherwise genomically-integrated genes. The chimeric
activators of the invention are particularly useful for activating transcription of integrated
single copy genes, which in the past have not been successfully transactivated at appreciable
levels. In preferred embodiments the level of expression of a chromosomally-integrated
target gene achieved with the novel and/or composite transcription activation domains
disclosed herein is at least two-fold, preferably three-fold, more preferably five-fold, and
optimally ten-fold or more greater than expression levels achieved using the VP16 domain
as the transcription activation domain in the analogous expression system, as measured in a
scientifically valid comparison.
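
The fold comparison against a VP16 reference can be sketched as a simple ratio. The Python
example below is illustrative only: the function names are hypothetical, the reporter
readings are invented numbers, and it assumes both readings come from the same assay under
matched conditions.

```python
# Thresholds taken from the tiers named in the text, highest first.
TIERS = [
    (10.0, "optimally (ten-fold or more)"),
    (5.0, "more preferably (five-fold)"),
    (3.0, "preferably (three-fold)"),
    (2.0, "at least two-fold"),
]


def fold_over_reference(test_units: float, reference_units: float) -> float:
    """Ratio of test-activator expression to reference (VP16) expression."""
    if reference_units <= 0:
        raise ValueError("reference expression must be positive")
    return test_units / reference_units


def preference_tier(fold: float) -> str:
    """Map a fold value onto the highest tier it satisfies."""
    for threshold, label in TIERS:
        if fold >= threshold:
            return label
    return "below two-fold"


# Invented readings in arbitrary reporter units, for illustration only.
fold = fold_over_reference(test_units=600.0, reference_units=100.0)
print(fold, preference_tier(fold))
```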

This invention also encompasses nucleic acid compositions comprising a first
nucleic acid encoding a chimeric transcription activator protein which contains one or
more ligand-binding domains for an oligomerizing ligand such as described above and a
second nucleic acid encoding a chimeric DNA-binding protein which comprises at least
one DNA-binding domain and at least one ligand binding domain for a cell permeant
ligand. Thus, the chimeric proteins encoded by these nucleic acids each contain one or
more ligand-binding domains for binding to an oligomerizing ligand which mediates the
dimerization or higher-order hetero-oligomerization of the chimeric proteins. In some
embodiments, one or both of the chimeric proteins contains two or more ligand-binding
domains. The nucleic acid composition may further comprise a target gene construct
comprising a target gene operatively linked to a transcriptional regulatory element which
includes a DNA sequence to which the chimeric DNA-binding protein binds.

This invention also encompasses nucleic acid compositions comprising a first
nucleic acid encoding a chimeric transcription activator protein which contains one or
more composite TADs and one or more DBDs, such as described above, and a second
nucleic acid comprising a target gene construct comprising a target gene operatively linked
to a transcriptional regulatory element which includes a DNA sequence to which the
chimeric transcription activator protein binds.

A nucleic acid encoding a chimeric protein of this invention may be operably
linked to a transcriptional regulatory element permitting expression of the chimeric protein
in cells. The various nucleic acids may be provided in DNA vectors as disclosed below.

The invention also encompasses methods for engineering cells for the regulated or
constitutive expression of a heterologous target gene. That method involves introducing
nucleic acids or nucleic acid compositions of this invention into the cells using methods
and materials permitting uptake by the cells of the nucleic acids. In embodiments of
particular interest, the target gene is integrated within the chromosomes of the host cells.
In some cases the transfected cells are selected and separately recovered from
non-transfected cells. In certain embodiments the cells are grown in culture. In some
embodiments the cells are engineered in vivo (within a whole organism), while in other
embodiments cells are transfected in vitro and the transfected cells and/or their progeny
are subsequently introduced into whole organisms. In the latter case, the engineered cells
may be encapsulated prior to introduction into the organism.

Engineered cells which contain one or more nucleic acids or nucleic acid
compositions of this invention are also encompassed. Again, engineered cells in which the
target gene is stably integrated within the cell's chromosomes are of particular interest.

This invention further encompasses methods for effecting expression of a target
gene which comprise maintaining genetically engineered host cells as described herein
under conditions suitable for gene expression. In regulated expression embodiments, this
will involve contacting the cells, in vitro or in vivo, with the ligand or oligomerizing agent
which binds to one or both of the chimeric proteins.

Applications of this invention include transcription of genes, constitutively or in a
drug-dependent manner in vitro, e.g. for the production of a desired protein which may be
separately recovered, for achieving higher levels of expression in transfection-based
assays (including two-hybrid assays), and for the regulated expression of required viral
genes in producer cell lines used for production of recombinant viruses (e.g. for the
regulated expression of AAV rep and/or cap genes in host cells used for the production of
recombinant AAV). Other applications include in vivo applications such as the
constitutive or regulated expression of a target gene of interest in an animal model (e.g. for
research or veterinary purposes) as well as for the constitutive or regulated expression of a
target gene of interest in a human subject, e.g. in the case of gene therapy. In the case of
human gene therapy, it will often be preferred that the components of the chimeric proteins
be of human origin and/or that the engineered cells be encapsulated.

Other features and advantages of the invention will be apparent from the following
detailed description and claims.
Brief Description of the Figures

Figure 1A shows the level of reporter gene expression (in SEAP Units) from the
plasmid 5xGAL4-IL2-SEAP transiently transfected into HT1080 cells cotransfected with
various amounts of GAL4-p65 (diamonds) or GAL4-VP16 (squares) expression constructs
(Activator DNA). Mean values of SEAP activity secreted into the medium are shown
(+/- S.D.).

Figure 1B shows the level of reporter gene expression (in SEAP Units) from a clone
of HT1080 cells (HT1080B) having an integrated plasmid cotransfected with various
amounts of GAL4-p65 (diamonds) or GAL4-VP16 (squares) expression constructs. Mean
values of SEAP activity secreted into the medium are shown (+/- S.D.).

Figure 1C shows the level of reporter gene expression (in SEAP Units) in a pool of
hundreds of independent HT1080 clones carrying an integrated plasmid
pLH-5xGAL4-IL2-SEAP cotransfected with various amounts of GAL4-p65 (diamonds) or
GAL4-VP16 (squares) expression constructs. Mean values of SEAP activity secreted into
the medium are shown (+/- S.D.).

Figure 2 shows the transcriptional activity of the integrated SEAP gene from the
pLH-5xGAL4-IL2-SEAP plasmid in the presence (squares) or absence (circles) of
co-transfected 5xGAL4-IL2-hGH reporter plasmid, as well as the transcriptional activity of
the transiently transfected 5xGAL4-IL2-hGH reporter plasmid (triangles) in HT1080 cells
cotransfected with various amounts of GAL4-p65 expression construct (Activator). Mean
values of SEAP activity and hGH protein secreted into the medium are shown (+/- S.D.).

Figure 3A is a diagrammatic representation of rapamycin-induced dimerization of
two fusion proteins, one containing the GAL4 DNA binding domain fused to FKBP12,
and the other containing the p65 activation domain fused to FRB, thereby leading to
stimulation of target gene expression.

Figure 3B is a diagrammatic representation of rapamycin-induced dimerization of
two fusion proteins, one containing a GAL4 DNA binding domain fused to three
copies of FKBP12, and the other containing the p65 activation domain fused to FRB,
thereby leading to association of three p65 activation domains with each GAL4 monomer
in the presence of rapamycin.

Figure 3C is a diagrammatic representation of rapamycin-induced dimerization of
two fusion proteins, one containing a GAL4 DNA binding domain fused to one copy of
FKBP12, and the other containing the "bundled" fusion protein, RLS, which contains the
tetramerization domain of lactose repressor between FRB and the p65 activation domain,
thereby leading to association of four activation domains with each FKBP in a
rapamycin-dependent manner.

Figure 4A represents the level of reporter gene expression (SEAP Units) of a stably
integrated pLH-5xGAL4-IL2-SEAP plasmid in HT1080B cells cotransfected with a
plasmid encoding a GAL4 DNA binding domain linked to 1, 2, 3, or 4 FKBP12 proteins
(GF1, GF2, GF3, and GF4, respectively) and either a plasmid encoding FRB fused to
p65 activation domain (RS) or a plasmid encoding FRB fused to the E. coli tetramerization
domain and p65 activation domain (RSL) in the presence of 10 nM rapamycin. Mean
values of SEAP activity secreted into the medium following addition of 10 nM rapamycin
are shown (+/- S.D.). AD/DBD ratio indicates the ratio between the amount of plasmid
encoding an activation domain (AD) and a plasmid encoding a GAL4 DNA
binding domain (DBD).

Figure 4B represents the level of reporter gene expression (SEAP Units) of a
transiently transfected 5xGAL4-IL2-SEAP plasmid in HT1080 cells cotransfected with a
plasmid encoding a GAL4 DNA binding domain linked to 1, 2, 3, or 4 FKBP12 proteins
(GF1, GF2, GF3, and GF4, respectively) and either a plasmid encoding FRB fused to
p65 activation domain (RS) or a plasmid encoding FRB fused to the E. coli tetramerization
domain and p65 activation domain (RSL) in the presence of 10 nM rapamycin. Mean
values of SEAP activity secreted into the medium following addition of 10 nM rapamycin
are shown (+/- S.D.). AD/DBD ratio indicates the ratio between the amount of plasmid
encoding an activation domain (AD) and a plasmid encoding a GAL4 DNA
binding domain (DBD).

Figure 5 shows the level of reporter gene expression (SEAP Units) of an integrated
SEAP gene in HT1080B cells transiently transfected with varying amounts of GAL4-p65
plasmid in the absence (square) or presence of either 120 nM trichostatin A (diamond) or
1 mM sodium butyrate (circle). Median values of SEAP activity secreted into the medium
are shown (+/- S.D.).

Figure 6 shows the level of reporter gene expression (SEAP Units) of an integrated 
SEAP gene in HT1080B cells transiently transfected with expression vectors encoding 
GAL4 fusion proteins having an activation domain from various transcription factors. 

Figure 7 shows the level of reporter gene expression (SEAP Units) of an integrated
SEAP gene in HT1080B cells transiently transfected with expression vectors encoding
GAL4 fusion proteins having no activation domain (G only); 1, 2, or 4 activation domains
from VP16 (GVP16X1, GVP16X2, and GVP16X3, respectively); 1, 2, 3, or 4 activation
domains from p65 (Gp65X1, Gp65X2, Gp65X3, and Gp65X4, respectively); or a
combination of an activation domain from VP16 and an activation domain from p65
(Gp65 + VP16).

Figure 8A shows the level of reporter gene expression (SEAP Units) of an
integrated SEAP gene in HT1080B cells transiently transfected with expression vectors
encoding GAL4 fusion proteins having 2, 4, 8, or 12 copies of V8 (GV8X2, GV8X4,
GV8X8, and GV8X12, respectively).

Figure 8B shows the level of reporter gene expression (SEAP Units) of an
integrated SEAP gene in HT1080B cells transiently transfected with expression vectors
encoding GAL4 fusion proteins having 1, 2, 3, 4, 5, or 6 copies of Vc (GVCX2,
GVCX3, GVCX4, GVCX5, and GVCX6, respectively).

Figure 8C shows the level of reporter gene expression (SEAP Units) of an
integrated SEAP gene in HT1080B cells transiently transfected with expression vectors
encoding GAL4 fusion proteins having 8 V8 copies (GV8X8), 5 Vc copies (GVCX5), 8
copies of V8 and 5 copies of Vc (GV8X8 + GVCX5), 8 copies of V8 and 5 copies of Vc in
the reverse order (GVCX5 + GV8X8), or VP16 (GVP16).

Figure 9A shows the level of reporter gene expression (SEAP Units) of an
integrated SEAP gene in HT1080B cells transiently transfected with expression vectors
encoding GAL4 fusion proteins having amino acids 450-550 of p65 (Gp65 (450-550)); 1,
2, or 4 copies of amino acids 361-450 of p65; or 1, 2, or 4 copies of an activation domain
from Sp1 (is this correct?).

Figure 9B shows the level of reporter gene expression (SEAP Units) of an
integrated SEAP gene in HT1080B cells transiently transfected with expression vectors
encoding GAL4 fusion proteins having an Sp1 activation domain alone (GS) or together
with one or two copies of the AP domain.

Figure 10A shows the level of reporter gene expression (SEAP Units) of an
integrated SEAP gene in HT1080B cells transiently transfected with expression vectors
encoding GAL4 fusion proteins having various activation domains or combinations thereof.

Figure 10B shows the level of reporter gene expression (SEAP Units) of an
integrated SEAP gene in HT1080B cells transiently transfected with expression vectors
encoding GAL4 fusion proteins having various activation domains or combinations thereof.

Figure 11 is a schematic representation of a composite activator.
Detailed Description of the Invention
I. General 

Protein-encoding genes in eukaryotes are transcribed by RNA polymerase II (pol
II), a multisubunit enzyme that is brought to an appropriate gene promoter (pol II
promoter) through the assembly of a pre-initiation complex comprising a number of
general transcription factors. The multisubunit protein complex TFIID is required for
transcription from most, if not all, promoters targeted by pol II. Whereas the TATA-box
binding protein (TBP) of TFIID is sufficient for basal transcription, pol II transcription is
also regulated by gene-specific activator proteins. Activator-dependent transcription
requires, inter alia, TBP-associated proteins (TAFs) and other transcriptional cofactors.
One of the important concepts to emerge from studies of eukaryotic gene expression is that
activators of pol II-dependent transcription are composed of functional modules whose
abilities to bind to subunits of the ultimate pol II complexes regulate transcriptional
activity of a nearby gene.

The present invention pertains to nucleic acid molecules and proteins which can be 
used to regulate the expression of genes in eukaryotic cells. 

One aspect of the present invention relates to chimeric transcriptional activators
("composite activators") which are derived to include a multiplicity of heterologous
"activation tags" (further defined infra), e.g., polypeptide sequences capable of affecting
transcriptional activation, as for example, affecting the assembly or stability of an active
polymerase complex. It has been discovered that activation tags from disparate proteins
can be combined in a single polypeptide, or artificially recruited by a complex to a single
recognition element, and retain the ability to synergistically activate transcription. In its
simplest elaboration, the chimeric activator includes at least two different activation tags
from heterologous sources, e.g., activation tags which are derived from two or more
different transcription factors or co-activators and which do not naturally occur together in
the same protein. As described further in the appended examples, constructs of this type
are observed to form potent transcriptional activators with a variety of uses. Accordingly,
the invention provides composite transcriptional activators which are derived with
activation tags from at least two separate proteins.

In this context, it has also been discovered that a variety of activation tags can be
repeated multiple times in the chimeric protein, or multiply associated therewith, with a
concomitant increase in transcriptional activation. This result was unexpected in light of,
as an example, the belief that squelching observed in the systems of the prior art was a
result of a rate limiting step involving the availability of general transcription factors, e.g.,
those of the TFIID complex. Accordingly, another type of composite activator of the
present invention is one wherein multiple copies of an activation tag are repeated, e.g.,
from 2 to 20 times, in a given composite activator. The repetitive activation tag
embodiment can be combined with other activation tags, e.g., from the same or
heterologous sources as the repeated activation tag(s). As described with greater detail
herein, in either of the above embodiments, the activation tag can itself induce
activator-dependent transcription. Alternatively, the activation tag may be one which alone
is insufficient to induce activator-dependent transcription, but rather produces a synergistic
effect when provided with a second activation tag which itself has some ability to induce
activator-dependent transcription. The composite activators of the present invention may
be generated to include DNA binding domains, ligand binding domains (LBDs) and/or
oligomerization domains (ODs).

Another aspect of the invention relates to the identification of novel activation tags.
In this regard, an activation tag has been identified in the NF-kB transcription factor
subunit p65. The so-called "alanine/proline rich" or "AP" activation tag of p65 extends
from about amino acid 361 to about amino acid 450 of that protein. See SEQ ID No. 2.
Similar AP activation tags are also present in, e.g., the p53 and CTF proteins. As
described in the Examples, the presence of one or several copies of the AP domain alone
in a protein does not provide the ability to induce activator-dependent transcriptional
activation. However, when linked to activation tags which are themselves capable of
inducing some level of activator-dependent transcription, e.g., another portion of p65 or
VP16, the AP activation tag synergizes with the second activation domain to induce an
increase in the level of activated transcription. The AP activation tag can be used, for
example, to construct chimeric transcription factors, to generate drug screening assays, or
as a competitive inhibitor of p65 or other transcription factors which utilize the AP
activation tag to form transcriptional complexes.
Still another aspect of the present invention relates to the observation that, in
contrast to episomal genes, the level of transcription of an integrated gene is directly
related to the number of activation tags that can be delivered to the gene. While not
wishing to be bound by any particular mechanistic theory, the inhibition of transcription of
an episomal gene in the presence of multiple activation units might result from the
sequestration by the activator (unbound to DNA) of either (i) an adaptor component
necessary to bridge the activator with the basal machinery, and/or (ii) a basal component.
In contrast, transcription of an integrated gene appears to be enhanced by delivery of
additional activation units to the promoter sequence(s) of the gene. Thus, the subject
chimeric activators can be used to drive high levels of transcription from
naturally-occurring, or otherwise genomically-integrated genes. The chimeric activators of
the invention are particularly useful for activating transcription of integrated single copy
genes, which in the past have not been successfully transactivated at appreciable levels.

II. Definitions 

For convenience, the meanings of certain terms and phrases employed in the
specification, examples, and appended claims are provided below.

As used herein, the term "gene" or "recombinant gene" refers to a nucleic acid 
molecule comprising an open reading frame and including at least one exon and
(optionally) an intron sequence. The term "intron" refers to a DNA sequence present in a 
given gene which is not translated into protein and is generally found between exons. 

As used herein, the term "nucleic acid" refers to polynucleotides such as 
deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). The term 
should also be understood to include, as equivalents, derivatives, variants and analogs of
either RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment
being described, single (sense or antisense) and double-stranded polynucleotides. 

The term "operably linked", when referring to a transcriptional regulatory sequence
and a coding sequence, is intended to mean that the regulatory sequence is associated with
the coding sequence in such a manner as to facilitate transcription of the coding sequence
in an activator-dependent fashion. 

The terms "protein", "polypeptide" and "peptide" are used interchangeably herein
when referring to a gene product, e.g., as may be encoded by a coding sequence.

"Transcriptional regulatory sequence", also termed herein "regulatory element" or
"regulatory sequence", is a generic term used throughout the
specification to refer to DNA sequences, such as initiation signals, enhancers, and
promoters, which induce or control transcription of protein coding sequences with which
they are operably linked. The term "enhancer", also referred to herein as "enhancer
element", is intended to include regulatory elements capable of increasing, stimulating, or
enhancing transcription from a basic promoter. The term "silencer", also referred to
herein as "silencer element", is intended to include regulatory elements capable of
decreasing, inhibiting, or repressing transcription from a basic promoter. Regulatory
elements can also be present in genes other than in 5' flanking sequences. Thus, it is
possible that regulatory elements of a gene are located in introns, exons, coding regions,
and 3' flanking sequences.

The terms "basic promoter" or "minimal promoter", as used herein, are intended to 
refer to the minimal transcriptional regulatory sequence that is capable of initiating 
transcription of a selected DNA sequence to which it is operably linked. This term is 
intended to represent a promoter element providing basal transcription. A basic promoter
frequently consists of a TATA box or TATA-like box and is bound by an RNA
polymerase and by numerous transcription factors, such as GTFs and TATA box binding
proteins (TBPs).

The terms "basic promoter" and "regulatory element" further encompass "tissue
specific" promoters and regulatory elements, i.e., promoters and regulatory elements
which effect expression of the selected DNA sequence preferentially in specific cells (e.g.,
cells of a specific tissue). Gene expression occurs preferentially in a specific cell if
expression in this cell type is significantly higher than expression in other cell types. The
terms "promoter" and "regulatory element" also encompass so-called "leaky" promoters
and "regulatory elements", which regulate expression of a selected DNA primarily in one
tissue, but cause expression in other tissues as well. The terms "promoter" and "regulatory
element" also encompass non-tissue specific promoters and regulatory elements, i.e.,
promoters and regulatory elements which are active in most cell types. Furthermore, a
promoter or regulatory element can be a constitutive promoter or regulatory element, i.e., a
promoter or regulatory element which constitutively regulates transcription, as opposed to
a promoter or regulatory element which is inducible, i.e., a promoter or regulatory element
which is active primarily in response to a stimulus. A stimulus can be, e.g., a molecule,
such as a hormone, a cytokine, a heavy metal, phorbol esters, cyclic AMP (cAMP), or
retinoic acid.

The term "core promoter element" is intended to include the TATA box and the 
initiator element.

"DNA recognition sequence" or "DNA recognition element", as those phrases are 
used herein, mean a DNA sequence which is capable of binding to one or more DNA-binding
domains, e.g., of a transcription factor.

The term "initiator" refers to a short, weakly conserved element that encompasses 
the transcription start site and which is important for directing the synthesis of properly 
initiated transcripts. While not wishing to be bound by any one theory, it is thought that 
TFIID contacts the initiator. 

The term "transcription factor" refers to any protein or modified form thereof that 
is involved in the initiation of transcription but which is not itself a part of the polymerase. 
Transcription factors are proteins or modified forms thereof, which interact preferentially 
with specific nucleic acid sequences, i.e., regulatory elements, and which in appropriate 
conditions stimulate transcription ("transcriptional activators") or repress transcription
("transcriptional repressors"). Some transcription factors are active when they are in the
form of a monomer. Alternatively, other transcription factors are active in the form of
oligomers consisting of two or more identical proteins or different proteins (heterodimer).
The factors have different actions during transcription initiation: they may interact with
other factors, with the RNA polymerase, with the entire complex, with activators, or with
DNA. The factors are generally classifiable into two groups: (i) the general transcription
factors, and (ii) the transcription activators. Transcription factors usually contain one or 
more regulatory domains. 

The term "regulatory domain" refers to any domain which regulates transcription,
and includes both activation and repression domains. The term "activation domain"
denotes a domain in a transcription factor which positively regulates (increases) the rate of 
gene transcription. The term "repression domain" denotes a domain in a transcription 
factor which negatively regulates (inhibits or decreases) the rate of gene transcription. 

The term "general transcription factor", used interchangeably herein with the term
"GTF" and with "basic transcription factor", refers to proteins or protein complexes which
work in concert with RNA Polymerase II to bring about promoter recognition and accurate
transcription initiation. These proteins constitute, together with RNA Polymerase II,
the Transcription Initiation Complex. GTFs include TFIIA, TFIIB, TFIID, TFIIE, TFIIF,
and TFIIH. These GTFs are usually sufficient to direct basal levels of transcription in
vitro from strong promoters (i.e., those containing TATA boxes). Several GTFs interact
with one another and/or with RNA Polymerase II. For example, TFIIE interacts with
TFIIH and RNA Polymerase II, TFIIF interacts with RNA Polymerase II and with TFIIB,
and TFIIB interacts with TBP from TFIID and RNA Polymerase II.

The term "transcriptional activator" as used herein refers to a protein or protein 
complex which is capable of enhancing the efficiency with which the basal transcription
complex performs, i.e., activating transcription. Thus, as used herein, a transcriptional
activator can be a single protein or alternatively it can be composed of several units at least
some of which are not covalently linked to each other. A transcriptional activator typically
has a modular structure, i.e., comprises various domains, such as a DNA binding domain,
and one or more transcriptional activation tags. Some transcriptional activators may
engage a subset of GTFs. For example, some transcriptional activators may include
activation tags which collectively contact both TFIID and TFIIB. Other transcriptional
activators may contact a single GTF. 

The term "cofactor", which is used interchangeably herein with the terms
"co-activator", "adaptor" and "mediator", refers to proteins which either enhance or repress
transcription in a non-gene specific manner, e.g., which lack intrinsic DNA binding
specificity. Thus, cofactors are general effectors. Positively acting cofactors do not
stimulate basal transcription, but enhance the response to an activator. Positively acting
cofactors include PC1, PC2, PC3, PC4, and ACF. TAFs which interact directly with
transcriptional activators are also referred to as cofactors.

The term "transcriptional activation tag", also referred to herein as "activation tag",
"transcriptional activation unit" and "activation unit", refers to a peptide sequence which is
capable of inducing or otherwise potentiating activator-dependent transcription, either on
its own or when linked covalently or non-covalently to another transcriptional activation
unit. As opposed to a transcriptional activator generally, an activation tag corresponds to a
minimal polypeptide sequence which retains the ability to interact directly or indirectly
with a transcription factor. Of course, unless otherwise clear from the context, where a
chimeric protein is referred to as "including" or "comprising" an activation tag, it will be
understood that other portions of the protein from which the tag is derived can be included.
Transcriptional activation tags can be rich in certain amino acids. For example, a
transcriptional activation unit can be a peptide rich in acidic residues, glutamine, proline,
or serine and threonine residues. Yet other transcriptional activators can be rich in
isoleucine or basic amino acid residues (see, e.g., Triezenberg (1995) Cur. Opin. Gen.
Develop. 5:190, and references therein). For instance, an activation tag can be a peptide
motif of at least about 6 amino acid residues associated with a transcription activation
domain, including the well-known "acidic", "glutamine-rich" and "proline-rich" motifs
such as the K13 motif from p65, the OCT2 Q domain and the OCT2 P domain,
respectively.

A "dimerization domain" is defined as a domain that induces formation of dimers 
between two proteins having that domain, while a "tetramerization domain" is defined as a 
domain that induces formation of tetramers amongst proteins containing the
tetramerization domain. An "oligomerization domain", generic for both dimerization and
tetramerization domains, facilitates formation of oligomers, which can be of any subunit
stoichiometry (of course greater than one).

The term "interact" as used herein is meant to include detectable interactions 
between molecules, such as can be detected using, for example, a yeast two hybrid assay
or by immunoprecipitation. The term "interact" is also meant to include "binding"
interactions between molecules. Interactions may be, for example, protein-protein,
protein-nucleic acid, protein-small molecule or small molecule-nucleic acid in nature. 

The term "holoenzyme complex" refers to RNA Polymerase II-containing
complexes. 

The term "squelching", which is used interchangeably herein with the term
"activator interference", refers to the inhibition of transcription observed when an activator
is present at artificially high concentrations (Ptashne and Gann (1990) Nature 346:329).
While not bound by any particular theory, this inhibition is understood to result from the
sequestration by the activator (unbound to DNA) of either (i) an adaptor component
necessary to bridge the activator with the basal machinery, and/or (ii) a basal component. 

The term "subunit", when referring to the subunit of a transcriptional activator,
refers to any unit of the transcriptional activator, e.g., a transcriptional activation unit, a
DNA binding domain, or a ligand binding domain.

The term "unit", when referring to a unit of a transcription factor, refers generally 
to a minimum portion of a transcription factor having a specific activity, e.g.,
transcriptional activation, transcriptional repression, DNA binding, or ligand binding.

As used herein, the term "transfection" means the introduction of a nucleic acid, 
e.g., an expression vector, into a recipient cell by nucleic acid-mediated gene transfer. The
term "transduction" is generally used herein when the transfection with a nucleic acid is by
viral delivery of the nucleic acid. "Transformation", as used herein, refers to a process in
which a cell's genotype is changed as a result of the cellular uptake of exogenous DNA or
RNA and, for example, the transformed cell expresses a recombinant form of a
polypeptide or, in the case of anti-sense expression from the transferred gene, the
expression of a naturally-occurring form of the recombinant protein is disrupted. 

As used herein, the term "transgene" refers to a nucleic acid sequence which has 
been introduced into a cell. Daughter cells deriving from a cell in which a transgene has
been introduced are also said to contain the transgene (unless it has been deleted). A
transgene can encode, e.g., a polypeptide, partly or entirely heterologous, i.e., foreign, to
the transgenic animal or cell into which it is introduced, or is homologous to an
endogenous gene of the transgenic animal or cell into which it is introduced, but which is
designed to be inserted, or is inserted, into the animal's genome in such a way as to alter
the genome of the cell into which it is inserted (e.g., it is inserted at a location which
differs from that of the natural gene). Alternatively, a transgene can also be present in an
episome. A transgene can include one or more transcriptional regulatory sequences and
any other nucleic acid (e.g., an intron) that may be necessary for optimal expression of a
selected coding sequence.

As used herein, the term "vector" refers to a nucleic acid molecule capable of
transporting another nucleic acid to which it has been linked. One type of preferred vector
is an episome, i.e., a nucleic acid capable of extra-chromosomal replication. Preferred
vectors are those capable of autonomous replication and/or expression of nucleic acids to
which they are linked. Vectors capable of directing the expression of genes to which they
are operatively linked are referred to herein as "expression vectors". In general, expression
vectors of utility in recombinant DNA techniques are often in the form of "plasmids",
which refer generally to circular double stranded DNA loops which, in their vector form,
are not bound to the chromosome. In the present specification, "plasmid" and "vector" are
used interchangeably as the plasmid is the most commonly used form of vector. However,
the invention is intended to include such other forms of expression vectors which serve
equivalent functions and which become known in the art subsequently hereto.

"Derived from" as that phrase is used herein indicates a peptide or nucleotide 
sequence selected from within a given sequence. A peptide or nucleotide sequence derived
from a named sequence may contain a small number of modifications relative to the parent
sequence, in most cases representing deletion, replacement or insertion of less than about
15%, preferably less than about 10%, and in many cases less than about 5%, of amino 
acid residues or base pairs present in the parent sequence. In the case of DNAs, one DNA 
molecule is also considered to be derived from another if the two are capable of selectively 
hybridizing to one another. 
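
By way of illustration only, the percentage thresholds in the foregoing definition can be sketched in Python. The sketch assumes that the number of modifications is measured as the edit distance (the minimum number of single-residue deletions, replacements or insertions) between the parent and candidate sequences, expressed as a fraction of the parent length; the specification does not prescribe a particular counting method. The function names are hypothetical, the "about" thresholds are treated as hard cut-offs for simplicity, and the hybridization criterion for DNAs is not modeled.

```python
def edit_distance(parent: str, candidate: str) -> int:
    """Minimum number of single-residue deletions, replacements or insertions."""
    previous = list(range(len(candidate) + 1))
    for i, p in enumerate(parent, start=1):
        current = [i]
        for j, c in enumerate(candidate, start=1):
            cost = 0 if p == c else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # replacement or match
        previous = current
    return previous[-1]


def modification_fraction(parent: str, candidate: str) -> float:
    """Modifications as a fraction of the residues present in the parent."""
    if not parent:
        raise ValueError("parent sequence must not be empty")
    return edit_distance(parent, candidate) / len(parent)


def derived_from_tier(parent: str, candidate: str):
    """Return the tightest threshold met (assumed hard cut-offs), or None."""
    fraction = modification_fraction(parent, candidate)
    for limit, label in ((0.05, "<5%"), (0.10, "<10%"), (0.15, "<15%")):
        if fraction < limit:
            return label
    return None
```

For a 100-residue parent, a candidate carrying three replacements would fall in the "<5%" tier under these assumptions, while one carrying twenty would meet none of the tiers.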

The terms "chimeric", "fusion" and "composite" are used to denote a protein,
peptide domain or nucleotide sequence or molecule containing at least two component
portions which are mutually heterologous in the sense that they are not, otherwise, found
directly (covalently) linked in nature. More specifically, the component portions are not
found in the same continuous polypeptide or gene in nature, at least not in the same order
or orientation or with the same spacing present in the chimeric protein or composite
domain. Such materials contain components derived from at least two different proteins or
genes or from at least two non-adjacent portions of the same protein or gene. Composite
proteins, and DNA sequences which encode them, are recombinant in the sense that they
contain at least two constituent portions which are not otherwise found directly linked
(covalently) together in nature.




III. Exemplary Chimeric Activators

In one aspect, the invention provides composite activators comprised of a
multiplicity of activation tags fused to, e.g., a DNA-binding domain, a ligand binding
domain and/or an oligomerization domain.

A composite transcription activation region consists of a continuous polypeptide
region containing two or more reiterated or otherwise heterologous activation tags. The
activation tags comprise polypeptide sequences derived from at least two different
proteins, polypeptide sequences which do not ordinarily occur in the same orientation
relative to one another (including reiterated copies of a polypeptide sequence), or
polypeptide sequences which do not occur in nature.

In one embodiment, at least one of the activation tags recruits TFIIA to a TFIID 
complex to form a "DA" complex, and/or stabilizes the formation of a DA complex.
Other activation tags recruit, or stabilize, complexes including other TAFs and
co-activators required for activator-dependent transcription. For instance, in certain cases the
composite activator sequence may recruit TFIIA and TFIIB, TFIIE, TFIIF, or TFIIH. In
a preferred embodiment, the composite activator includes at least two different activation
tags from the group of acidic activation tags, proline-rich transcription activation tags,
serine/threonine-rich activation tags, glutamine-rich activation tags, and AP activation
tags, and even more preferably, at least two of those activation tags are selected from
disparate proteins (i.e., that do not naturally occur together in the same protein).

Where multiple copies of a particular activation tag are included in the same 
contiguous polypeptide, the composite activator preferably includes at least 3 copies of the
activation tag, but more preferably at least 5, 10, 15, or even at least 20 copies of the tag.

Shown in Figure 11 is a representative example of the subject composite activators.
In the exemplary composite activator, two or more of the activation tags are from different
proteins. The AP activation sequence and the p65(450-550) activation sequence are
derived from the human NF-κB p65 subunit, whereas the V8 domain is an acidic activation
tag from the N-terminus of VP16, and the Vc activation tag is a C-terminal portion of
VP16. Each of the p65(450-550), V8, and Vc sequences is capable of inducing
activator-dependent transcription. Presented in the same polypeptide sequence, these activation tags
induce expression of a responsive gene in a synergistic fashion. On the other hand, the AP
activation tag on its own is insufficient to induce activator-dependent transcription, but
when combined with an activation tag such as the V8 activation tag, the presence of the
AP activation tag increases transactivation of the gene relative to the V8 activation tag
alone.

Another feature of the illustrated composite activator is the repetition of certain of 
the activation tag sequences. Reiterating the V8 and Vc activation tags, for example,
increases the potency of the resulting protein as a transcriptional activator, particularly of
genomically integrated genes. 

As described in more detail below, the ideal number of activation tags to include,
and the relative orientations of the various tags to one another in the composite activators,
can be readily determined by techniques known in the art, including high throughput 
screening using reporter gene-based assays. 

Based at least in part on the lack of squelching observed from the instant composite 
activators, the subject fusion proteins can include one or more oligomerization sequences
which permit non-covalent oligomerization of multiple composite activators. For
instance, as Figure 3C shows, including a tetramerization domain permits the formation of
multimerized complexes of the composite activator sequences. Reiterating from above,
this embodiment derives in part from the discovery that squelching due to abundance of
activation tags, particularly for forming the DA complex, is not a hindrance in expression
from genomic sequences.

As desired, the composite activation sequence can be provided as part of a fusion 
protein including a DNA binding domain. In other embodiments, such as illustrated in
Figure 3, the composite activation sequence can be fused with a ligand binding domain
which, in the presence of a multivalent ligand, can facilitate recruitment of the composite
activator to a DNA-bound complex. The complex can be loaded with multiple activators,
in a ligand-dependent manner, by inclusion of multiple ligand binding domains. 

In the instance where the composite activator contains one or more oligomerization 
domains and/or ligand binding domains, but is not contiguous with a DNA binding 
domain, the composite activator can be coexpressed in cells with a second protein 
including a DNA binding domain and appropriate oligomerization or ligand binding
domains to form complexes with the composite activator proteins. Thus, composite
activator proteins can be recruited to a site of transcriptional regulation by interaction with
a DNA binding protein by oligomerization, which may be constitutive or inducible. 

Techniques for making the subject fusion proteins are adapted from well-known
procedures. Essentially, the joining of various DNA fragments coding for different
polypeptide sequences is performed in accordance with conventional techniques,
employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion
to provide for appropriate termini, filling in of cohesive ends as appropriate, alkaline
phosphatase treatment to avoid undesirable joining, and enzymatic ligation. Alternatively,
the fusion gene can be synthesized by conventional techniques including automated DNA
synthesizers. In another method, PCR amplification of gene fragments can be carried out
using anchor primers which give rise to complementary overhangs between two
consecutive gene fragments. Amplification products can subsequently be annealed to
generate a chimeric gene sequence (see, for example, Current Protocols in Molecular
Biology, Eds. Ausubel et al., John Wiley & Sons: 1992).



A. Activation Tags 

The activation tags useful in the composite activators of the present invention can 
be derived from one or more transcription factors. Polypeptides which can function to 
activate transcription in eukaryotic cells are well known in the art. In particular, 
transcriptional activation domains which contain suitable activation tags have been 
described for many DNA binding proteins and have been shown to retain their activation 
function when the transcriptional activation domain, or a suitable fragment thereof, is 
transferred to a heterologous protein. 

Activation tags can be naturally occurring or can be synthetic, so long as, either
alone or in combination with other activation tags, they are capable of enhancing the
efficiency with which the basal and/or activator-dependent transcription complex
performs, i.e., so long as they are capable of interacting with a transcription factor or
co-activator protein. Any particular activation tag is preferably at least 6 amino acids in
length, and preferably contains no more than about 300 amino acid residues, though even 
more preferably, less than 200 or even less than 100 residues. 
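
The length preferences stated above can be summarized, purely as an illustrative sketch, in a short Python helper. The function name and the returned labels are hypothetical, and the "about 300" upper limit is treated as a hard cut-off only for the sake of the example.

```python
MIN_TAG_LENGTH = 6    # "at least 6 amino acids in length"
MAX_TAG_LENGTH = 300  # "no more than about 300 amino acid residues" (assumed hard limit)


def tag_length_preference(tag_sequence: str) -> str:
    """Classify a candidate activation tag by the stated length preferences."""
    length = len(tag_sequence)
    if length < MIN_TAG_LENGTH:
        return "shorter than preferred"
    if length > MAX_TAG_LENGTH:
        return "longer than preferred"
    if length < 100:
        return "most preferred (less than 100 residues)"
    if length < 200:
        return "more preferred (less than 200 residues)"
    return "preferred (up to about 300 residues)"
```

Under these assumptions, an eight-residue peptide such as DFDLDMLG falls in the shortest (most preferred) bracket, while a five-residue peptide falls below the stated minimum.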

Naturally occurring activation units include portions of transcription factors, such as a 
thirty amino acid fragment of the C-terminus of VP16 (amino acids 461-490), referred to
herein as "Vc". 

Other activation units are derivatives of naturally occurring peptides. For example, 
the replacement of one amino acid of a naturally occurring activation unit by another may 
further increase activation. An example of such an activation unit is a derivative of an 
eight amino acid peptide of VP16, the derivative having the amino acid sequence
DFDLDMLG. 

Yet other activation units are entirely synthetic. It is known, for example, that
certain random alignments of acidic amino acids are capable of activating transcription.

It is well known in the art that certain transcription factors are active only in specific cell
types, i.e., that transcription factors can act in a tissue specific manner. Without wanting
to be limited to a specific mechanism of action, it is possible that this tissue specificity
results from the fact that the transcription factor interacts with specific factors, e.g.,
cofactors, which are present only in certain cell types. This tissue specificity can be
localized to a specific portion of the transcription factor. In certain transcription factors,
this portion is located outside activation domains, whereas in other transcription factors,
this portion may be localized within an activation domain. Thus, by using activation tags
which are functional essentially in specific cells, it is possible to design a transcriptional
activator of the invention having a certain tissue specificity.

A preferred polypeptide for use in the fusion protein of the invention is the herpes simplex
virus virion protein 16 (referred to herein as VP16, the amino acid sequence of which is
disclosed in Triezenberg, S.J. et al. (1988) Genes Dev. 2:718-729). In one embodiment, an
activation tag corresponding to about 127 of the C-terminal amino acids of VP16 is used.
For example, a polypeptide having an amino acid sequence shown in SEQ ID NO: X
(positions 208-335) can be used as the second polypeptide in the fusion protein. In
another embodiment, at least one copy of about 11 amino acids from the C-terminal region
of VP16 which retain transcriptional activation ability is used as the activation tag.
Preferably, an oligomer of this region (i.e., about 22 amino acids) is used. Suitable
C-terminal peptide portions of VP16 are described in Seipel, K. et al. (EMBO J. (1992)
11:4961-4968).

Another example of an acidic activation tag is provided in residues 753-881 of GAL4.

One particularly important source of transcription activation tags which are
featured in a number of embodiments of the invention is the (human) NF-κB subunit
p65. In one embodiment the chimeric activator contains one or more copies of a peptide
sequence comprising all or part of the p65 sequence spanning residues 450-550, or a
peptide sequence derived therefrom. In certain embodiments, it has been found that
extending the p65 peptide sequence to include sequence spanning p65 residues 361-450,
e.g., including the "AP activation tag", leads to an unexpected increase in transcription
activation. Moreover, a peptide sequence comprising all or a portion of p65(361-550), or
peptide sequence derived therefrom, in combination with heterologous activation tags, can
yield surprising additional increases in the level of transcription activation. p65-based
activation domains function across a broad range of promoters and have yielded increases
in transcription levels six-fold, eight-fold and even 14-15-fold higher than obtained with
tandem copies of VP16 which itself is widely recognized as a very potent activation
domain.

While the resultant increases in activation potency are dramatic, p65-based
transcription factors possess additional and unexpected characteristics. For instance, unlike
VP16, the subject p65-based activators do not appear to be toxic to the engineered cells.
This is clearly of profound practical significance in many applications. It is expected that
recombinant DNA molecules encoding chimeric proteins which contain a p65 activation
tag, or peptide sequence derived therefrom, will provide significant advantages for
heterologous gene expression in its various contexts, including constitutive systems such
as described herein, as well as in regulated systems such as described in International
patent applications PCT/US94/01617, PCT/US95/10591, PCT/US96/09948 and the like,
as well as in other heterologous transcription systems such as those involving
tetracycline-based regulation reported by Bujard et al. and those involving steroid or other
hormone-based regulation.

One class of p65-based transcription factors contains more than one copy of a
p65-derived domain. Such proteins will typically contain two to about six copies of a peptide
sequence comprising all or a portion of p65(361-550), or peptide sequence derived
therefrom. 

Other polypeptides with transcriptional activation ability in eukaryotic cells can be 
used to provide activation tags for the fusion protein of the invention. Transcriptional
activation domains found within various proteins have been grouped into categories based
upon similar structural features. Types of transcriptional activation domains include, in
addition to the acidic transcription activation domains, proline-rich transcription activation
domains, serine/threonine-rich transcription activation domains and glutamine-rich
transcription activation domains. Examples of proline-rich activation domains include
amino acid residues 399-499 of CTF/NF1 and amino acid residues 31-76 of AP2.
Examples of serine/threonine-rich transcription activation domains include amino acid
residues 1-427 of ITF1 and amino acid residues 2-451 of ITF2. Examples of
glutamine-rich activation domains include amino acid residues 175-269 of Oct1 and amino acid
residues 132-243 of Sp1. The amino acid sequences of each of the above described
regions, and of other useful transcriptional activation domains, are disclosed in Seipel, K.
et al. (EMBO J. (1992) 11:4961-4968).

Still other illustrative activation domains and motifs of human origin include the 
activation domain of human CTF, the 18 amino acid (NFLQLPQQTQGALLTSQP) 
glutamine rich region of Oct-2, the N-terminal 72 amino acids of p53, the SYGQQS repeat
in Ewing sarcoma gene and an 11 amino acid (535-545) acidic rich region of RelA
protein.

In addition to previously described transcriptional activation domains, novel 
transcriptional activation tags, which can be identified by standard techniques, are within 
the scope of the invention. The transcriptional activation ability of a polypeptide can be 
assayed by linking the polypeptide to another polypeptide having DNA binding activity
and determining the amount of transcription of a target sequence that is stimulated by the
fusion protein. For example, a standard assay used in the art utilizes a fusion protein of a
putative activation tag and a GAL4 DNA binding domain (e.g., amino acid residues 1-93).
This fusion protein is then used to stimulate expression of a reporter gene linked to GAL4
binding sites (see, e.g., Seipel, K. et al. (1992) EMBO J. 11:4961-4968 and references cited
therein).

As set out above, transcriptional activators can also comprise activation tags, which 
in the absence of other activation tags are incapable of activating transcription. For 
example, the activation tag AP from VP16 does not activate transcription in the absence of 
other activation units, even when present in multiple copies. 

Accordingly, the invention provides an AP activation tag, or derivative thereof,
which may interact with TFIIA, but which on its own is incapable of activating
transcription. In one embodiment, the AP polypeptide has the amino acid sequence of
residues 361-450 of SEQ ID NO. 2, or a fragment thereof. Functionally equivalent
derivatives of the AP activation tag can be obtained, for instance, by screening derivatives
of AP for binding to TFIIA and measuring transcriptional activity of such a derivative in a
co-transfection assay, as described above. Such equivalents are expected to include forms
of the activation tag which are truncated at either the N-terminus or C-terminus or both,
e.g., fragments of p65 (or homologous sequences thereto) which are about 75, 60, 50, 30
or even 20 amino acid residues in length (e.g., ranging in length from 20-89 amino acids).
Likewise, it is expected that the AP activation tag sequence from p65 can tolerate amino
acid substitutions, e.g., to produce AP tags of at least 95%, 90%, 80% and even 70%
identity with the AP activation tag sequence of SEQ ID NO. 2.

The multiple activation units and other domains of the transcriptional activators of the
invention can be from any eukaryotic species, and it is not necessary that every unit or
domain be from the same species. Preferred species include vertebrates, such as
mammals. Even more preferred units or domains are from humans. For use of the
transcriptional activators of the invention in gene therapy in a subject of a specific species,
e.g., human, it is preferable to use units and domains from the same species to avoid
immune reactions against the transcriptional activator or complex.

The activation units of a transcriptional coactivator can be covalently linked to
each other in a linear array, i.e., the NH2-terminus of one activation unit is linked to the
COOH-terminus of another activation unit. The activation units can be arranged in any
order. However, as described herein, it has been observed that certain arrangements of
activation units result in higher levels of transcriptional activation than other
arrangements. The order in which the activation units should be arranged will depend on
the result desired, i.e., the degree of transactivation that one desires to achieve, and can be
determined, e.g., by performing cotransfection experiments, as described in the Examples.
Briefly, expression vectors encoding the activation units in various arrangements linked to a
DNA binding domain are cotransfected together with a reporter construct containing a
reporter gene operably linked to a promoter containing a DNA site recognized by the DNA
binding domain, and expression of the reporter gene is measured. For such assays, it is
preferable to use a cell line in which the activation units are known to be active.

In order to optimize the transcriptional activity of a composite activator of the
subject invention, it can be advantageous to shuffle the activation tags relative to one
another in the polypeptide. There are a variety of combinatorial techniques available which
can be adapted for creating combinatorial libraries of the subject composite activators by
creating and sampling libraries of proteins in which the activation tags are shuffled in the
polypeptide to some degree relative to one another.

In an exemplary embodiment, the coding sequences for individual activation tags
are synthesized to include a NotI cleavage site at both the 5' and 3' ends by addition of the
sequence GCGGCCGCN, which also encodes (Ala)3, an unstructured linker. The benefits
of unstructured linkers are described below. A variety of nucleic acids encoding such
activation tag sequences can be treated with NotI, admixed with one another, and religated
to form a combinatorial library representing various positional combinations of the
different activation tag sequences relative to each other. Those combination(s) of
activation tags which produce the greatest level of transcriptional activation can be readily
identified in the resulting combinatorial library, e.g., by selections based on expression of
a quantifiable reporter gene.
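
The shuffling scheme described above can be illustrated with a minimal Python sketch. It checks that each GCGGCCGCN variant reads as three alanine codons and enumerates the orderings of a small set of activation tags. The tag names, the function names and the restriction to one copy of each tag per construct are assumptions made for illustration; religation in practice could also yield constructs with repeated or missing tags.

```python
from itertools import permutations

# Alanine codons in the standard genetic code (GCN, N = any base).
ALANINE_CODONS = {"GCA", "GCC", "GCG", "GCT"}


def linker_encodes_ala3(n_base):
    """Check that GCGGCCGC + n_base reads as three alanine codons."""
    seq = "GCGGCCGC" + n_base
    codons = [seq[i:i + 3] for i in range(0, 9, 3)]
    return all(codon in ALANINE_CODONS for codon in codons)


def arrangements(tags):
    """Return every ordering of the given tags, each tag used once."""
    return [list(order) for order in permutations(tags)]


def best_arrangement(candidates, reporter_signal):
    """Pick the ordering with the highest measured reporter signal.

    reporter_signal maps a tuple of tag names to a measured value,
    e.g. reporter expression from a cotransfection assay.
    """
    return max(candidates, key=lambda order: reporter_signal[tuple(order)])


if __name__ == "__main__":
    # Hypothetical tag names, used only for illustration.
    library = arrangements(["tagA", "tagB", "tagC"])
    print(len(library), "orderings")
    print(all(linker_encodes_ala3(n) for n in "ACGT"))
```

The number of orderings grows factorially with the number of tags, which is one reason a selection based on a reporter gene is more practical than testing each arrangement individually.
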

In another embodiment, combinatorial intron splicing can be used to generate a
diverse library of composite activation sequences. U.S. Patent 5,498,531 describes a
means for carrying out the equivalent of "exon shuffling" by intron-mediated
trans-splicing. Briefly, the trans-splicing system of the '531 patent provides an active set of
transcripts for trans-splicing wherein flanking intronic sequences can interact in an
intermolecular reaction to reconstitute a reactive complex which promotes
transesterification of two or more transcripts. RNA transcripts are derived which include
an "exon" sequence, e.g., encoding an activation tag, which is flanked on each side by
intron sequences that can direct trans-splicing of the exon sequences to each other. By
admixing these RNA constructs, intermolecular complementation between the flanking
intron sequences of two different constructs forms a functional intron which mediates the
transesterification reactions necessary to ligate two discontinuous activation tag sequences
to one another, and thereby generate an elongated transcript comprising these concatenated
RNA sequences encoding activation tag sequences. In one embodiment, the activation tag
"exons" are flanked by portions of a group I or group II intron, such that the
interaction of the flanking intronic sequences is sufficient to produce an autocatalytic core
capable of driving ligation of the exons in the absence of any other factors. In the present
context, the term "exon" merely denotes nucleic acid sequences encoding polypeptides
including an activation tag, and can, for instance, correspond to naturally and
non-naturally occurring exon sequences. The resulting RNA transcripts are reverse
transcribed, and as above, the positional permutations of greatest transcriptional activity
are easily isolated.



B. DNA binding domain 

In certain embodiments of the invention, the composite activator is provided as
part of a chimeric protein which further comprises a DNA-binding domain. In other
embodiments, the composite activator is localized to a DNA sequence by virtue of a
constitutive or inducible oligomerization with a chimeric DNA binding domain. In such
instances, the DNA binding domain can be provided in a fusion protein with one or more
oligomerization domains or ligand binding domains. The choice of component
DNA-binding domains may be influenced by a number of considerations, including the species,
system and cell type to which it is targeted; the feasibility of its incorporation into a chimeric
protein, as may be shown by modeling; and the desired application or utility.

The DNA binding domain can be a naturally occurring DNA-binding domain from 
a transcription factor. Alternatively, the DNA binding domain can be an artificial (or 
partially artificial) polypeptide sequence having DNA binding activity. For example, the
DNA-binding domain can be a naturally occurring DNA binding domain that has been 
modified to recognize a different DNA binding site. The particular DNA-binding domain 
chosen will depend on the target promoter. For example, if the gene to be transcriptionally 
activated by the subject method is an endogenous gene, the DNA-binding domain must be 
able to interact with the promoter of the endogenous gene (endogenous promoter). 
Alternatively, as described in greater detail below, the endogenous promoter could be 
replaced, e.g., by homologous recombination, with a heterologous promoter for which the 
DNA binding domain is selected. Such a substitution may be necessary if no transcription 
factor is known to bind the endogenous promoter of interest. Alternatively, in such a 
situation, it is also possible to clone a DNA-binding domain interacting specifically with a 
sequence in the promoter of interest. This can be done, e.g., by phage display screening 
with a DNA molecule comprising at least a portion of the promoter of interest.

Desirable properties of DNA binding domains include high affinity for specific 
nucleotide sequences, termed herein "target sequences", low affinity for most other 
sequences in a complex genome (such as a mammalian genome), low dissociation rates 
from specific DNA sites, and novel DNA recognition specificities distinct from those of 
known natural DNA-binding proteins. Preferably, binding of a DNA-binding domain to a 
specific target sequence is at least two, more preferably three and even more preferably
more than four orders of magnitude greater than binding to any one alternative DNA
sequence, as may be measured by relative Kd values or by relative rates or levels of
transcription of genes associated with the selected and any alternative DNA sequences. It
is also preferred that the selected DNA sequence be recognized to a substantially greater
degree by the DNA binding domain of the transcriptional activator of the invention than by
an endogenous protein. Thus, for example, target gene expression in a cell is preferably 
two, more preferably three, and even more preferably more than four orders of magnitude 
greater in the presence of the transcriptional activator of the invention containing a DNA- 
binding region than in its absence. 
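
As a worked illustration of the specificity criterion above, the following Python sketch expresses the preference for a target site over an alternative site as a number of orders of magnitude, computed from two dissociation constants, and maps it onto the tiers named in the text. The function names, the example values and the small numerical tolerance are assumptions made for the example only.

```python
import math

_TOLERANCE = 1e-9  # guards against floating-point rounding at tier boundaries


def orders_of_magnitude(kd_target, kd_alternative):
    """Return log10 of Kd(alternative) / Kd(target)."""
    if kd_target <= 0 or kd_alternative <= 0:
        raise ValueError("dissociation constants must be positive")
    return math.log10(kd_alternative / kd_target)


def preference_tier(orders):
    """Map a specificity (in orders of magnitude) onto the tiers in the text."""
    if orders > 4 + _TOLERANCE:
        return "even more preferred"   # more than four orders of magnitude
    if orders >= 3 - _TOLERANCE:
        return "more preferred"        # three orders of magnitude
    if orders >= 2 - _TOLERANCE:
        return "preferred"             # at least two orders of magnitude
    return "below the preferred range"


if __name__ == "__main__":
    # Hypothetical Kd values in mol/L, chosen only to exercise the function.
    spec = orders_of_magnitude(kd_target=1e-10, kd_alternative=1e-7)
    print(round(spec, 3), preference_tier(spec))
```
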

Preferred DNA binding domains have a dissociation constant for a target sequence
below 10^-8 M, preferably 10^-9 M, more preferably below 10^-10 M, even more preferably
below 10^-11 M. For gene therapy applications, they are preferably derived from human
proteins.

From a structural perspective, DNA-binding domains that can be used in the invention may
be classified as DNA-binding proteins with a helix-turn-helix structural design, such as,
but not limited to, Myb, Ultrabithorax, Engrailed, Paired, Fushi tarazu, HOX, Unc86, the
Ets and homeobox families of transcription factors, and the previously noted Oct1, Oct2
and Pit; zinc finger proteins, such as Zif268, SWI5, Krüppel and Hunchback; steroid
receptors; DNA-binding proteins with the helix-loop-helix structural design, such as
Daughterless, Achaete-scute (T3), MyoD, E12 and E47; and other helical motifs like the
leucine-zipper, which includes GCN4, C/EBP, c-Fos/c-Jun and JunB. The amino acid
sequences of the component DNA-binding domains may be naturally-occurring or
non-naturally-occurring (or modified). DNA-binding domains and their target sites can be
found at TFSEARCH (http://www.genome.ad.jp/SIT/TFSEARCH.html). Another
publicly available database of transcription factors and the sequences to which they bind is
available from the National Library of Medicine in the "Transcription Data Base".

One strategy for obtaining component DNA-binding domains with properties 
suitable for this invention is to modify an existing DNA-binding domain to reduce its 
affinity for DNA into the appropriate range. For example, a homeodomain such as that 
derived from the human transcription factor Phox1, may be modified by substitution of the
glutamine residue at position 50 of the homeodomain. Substitutions at this position
remove or change an important point of contact between the protein and one or two base
pairs of the 6-bp DNA sequence recognized by the protein. Thus, such substitutions
reduce the free energy of binding and the affinity of the interaction with this sequence and
may or may not simultaneously increase the affinity for other sequences. Such a reduction
in affinity is sufficient to effectively eliminate occupancy of the natural target site by this
protein when produced at typical levels in mammalian cells. But it would allow this
domain to contribute binding energy to and therefore cooperate with a second linked
DNA-binding domain. Other domains that are amenable to this type of manipulation include
the paired box, the zinc-finger class represented by steroid hormone receptors, the myb 
domain, and the ets domain. 

In another embodiment, the DNA binding domain is created from the assembly of
DNA binding domains from various transcription factors, resulting in a DNA binding
domain having a novel DNA binding specificity. Such DNA binding domains, referred to
herein as "composite DNA binding domains", can be designed to specifically recognize
unique binding sites. For example, a DNA binding domain can be constructed that
comprises DNA binding regions from a zinc finger protein and a homeobox protein. One
such DNA binding domain is ZFHD1, a composite DNA binding domain comprising an
Oct-1 homeodomain and zinc fingers 1 and 2 of Zif268, which is further described in PCT
Application WO 96/20951 by Pomerantz et al.

The DNA sequences recognized by a chimeric protein containing a composite 
DNA-binding domain can be determined experimentally, as described below, or the
proteins can be manipulated to direct their specificity toward a desired sequence. A
desirable nucleic acid recognition sequence consists of a nucleotide sequence spanning at
least ten, preferably eleven, and more preferably twelve or more bases. The component
binding portions (putative or demonstrated) within the nucleotide sequence need not be
fully contiguous; they may be interspersed with "spacer" base pairs that need not be
directly contacted by the chimeric protein but rather impose proper spacing between the
nucleic acid subsites recognized by each module. These sequences should not impart 
expression to linked genes when introduced into cells in the absence of the engineered 
DNA-binding protein. 
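
The site architecture described above, two subsites optionally separated by spacer base pairs, lends itself to systematic enumeration. The Python sketch below lists candidate composite sites for two subsites over a range of spacer lengths and both relative orientations, keeping only candidates that span at least ten bases. The subsite sequences in the example are placeholders rather than sequences taken from this document, and the function names are illustrative assumptions.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def candidate_sites(subsite_a, subsite_b, max_spacer, min_length=10):
    """List (spacer, orientation, site) tuples; spacer bases are written as N.

    For each spacer length from 0 to max_spacer, the second subsite is placed
    either as given ("direct") or as its reverse complement ("inverted").
    """
    sites = []
    for spacer, flipped in product(range(max_spacer + 1), (False, True)):
        second = reverse_complement(subsite_b) if flipped else subsite_b
        site = subsite_a + "N" * spacer + second
        if len(site) >= min_length:
            sites.append((spacer, "inverted" if flipped else "direct", site))
    return sites


if __name__ == "__main__":
    # Placeholder subsites, hypothetical and for illustration only.
    for spacer, orientation, site in candidate_sites("TAATGA", "GCGTGG", 2):
        print(spacer, orientation, site)
```

Each candidate would still have to be tested experimentally, both for affinity and for the absence of background expression in cells lacking the engineered protein.
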

To identify a nucleotide sequence that is recognized by a transcriptional activator
protein containing the composite DNA-binding region, preferably recognized with high
affinity (dissociation constants of 10^-11 M or lower are especially preferred), several methods
can be used. If high-affinity binding sites for individual subdomains of the composite
DNA-binding region are already known, then these sequences can be joined with various
spacing and orientation and the optimum configuration determined experimentally (see
below for methods for determining affinities). Alternatively, high-affinity binding sites
for the protein or protein complex can be selected from a large pool of random DNA
sequences by adaptation of published methods (Pollock, R. and Treisman, R., 1990, A
sensitive method for the determination of protein-DNA binding specificities. Nucl. Acids
Res. 18, 6197-6204). Bound sequences are cloned into a plasmid and their precise
sequence and affinity for the proteins are determined. From this collection of sequences,
individual sequences with desirable characteristics (i.e., maximal affinity for composite
protein, minimal affinity for individual subdomains) are selected for use. Alternatively, the
collection of sequences is used to derive a consensus sequence that carries the favored base
pairs at each position. Such a consensus sequence is synthesized and tested (see below) to
confirm that it has an appropriate level of affinity and specificity.
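
Deriving a consensus from a collection of selected sequences can be sketched as a per-position majority vote. The Python example below assumes the selected sequences have already been aligned and trimmed to equal length, which is a simplification; the function name and the example sequences are hypothetical.

```python
from collections import Counter


def consensus(sequences):
    """Return the most frequent base at each position of equal-length sequences.

    Ties are resolved in favour of the base seen first at that position,
    which follows from the ordering used by Counter.most_common.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*sequences)
    )


if __name__ == "__main__":
    # Hypothetical selected sites, for illustration only.
    selected = ["TAATGAGCGTGG", "TAATGAGCGTGA", "TAATTAGCGTGG"]
    print(consensus(selected))
```

A simple majority discards information about how strongly each position is conserved, so the resulting sequence still needs to be synthesized and tested as the text describes.
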

A number of well-characterized assays are available for determining the binding
affinity, usually expressed as dissociation constant, for DNA-binding proteins and the
cognate DNA sequences to which they bind. These assays usually require the preparation
of purified protein and binding site (usually a synthetic oligonucleotide) of known
concentration and specific activity. Examples include electrophoretic mobility-shift assays,
DNase I protection or "footprinting", and filter-binding. These assays can also be used to
get rough estimates of association and dissociation rate constants. These values may be
determined with greater precision using a BIAcore instrument. In this assay, the synthetic
oligonucleotide is bound to the assay "chip," and purified DNA-binding protein is passed
through the flow-cell. Binding of the protein to the DNA immobilized on the chip is
measured as an increase in refractive index. Once protein is bound at equilibrium, buffer
without protein is passed over the chip, and the dissociation of the protein results in a
return of the refractive index to baseline value. The rates of association and dissociation
are calculated from these curves, and the affinity or dissociation constant is calculated
from these rates. Binding rates and affinities for the high affinity composite site may be
compared with the values obtained for subsites recognized by each subdomain of the
protein. As noted above, the difference in these dissociation constants should be at least
two orders of magnitude and preferably three or greater.
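
The final step of the kinetic measurement described above, calculating the dissociation constant from the two rates, uses the standard relation Kd = k_off / k_on. The short Python sketch below applies it; the function name and the rate values are assumptions chosen only to illustrate the arithmetic and are not measurements from this document.

```python
def dissociation_constant(k_on, k_off):
    """Return Kd in mol/L from k_on (1/(M*s)) and k_off (1/s)."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    if k_off < 0:
        raise ValueError("k_off must not be negative")
    return k_off / k_on


if __name__ == "__main__":
    # Hypothetical rate constants for a composite site.
    kd = dissociation_constant(k_on=1e6, k_off=1e-4)
    print(f"Kd = {kd:.1e} M")
```
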

The invention further provides composite activators of the present invention
provided as a fusion protein with an inducible DNA binding domain(s). In one
embodiment, the inducible DNA binding domain is the E. coli tet repressor (TetR), which
binds to tet operator (tetO) sequences upstream of target genes. In the presence of
tetracycline, or an analog, which binds to TetR, DNA binding is abolished and thus
transactivation is abolished. This system, in which the TetR had previously been linked to
transcription activation domains, e.g. from VP16, is generally referred to as an allosteric
"off-switch" described by Gossen and Bujard (Proc. Natl. Acad. Sci. U.S.A. (1992)
89:5547) and in U.S. Patents 5,464,758; 5,650,298; and 5,589,362 by Bujard et al.
Furthermore, depending on the concentration of the antibiotic in the culture medium (0-1
µg/ml), target gene expression can be regulated over a range of up to several orders
of magnitude. Thus, the system not only allows differential control of the activity of an
individual gene in eukaryotic cells but also is suitable for creation of "on/off" situations for
such genes in a reversible way. This system provides low background and relatively high
target gene expression in the absence of tetracycline or an analog. Thus, the invention
described herein provides a method for obtaining even stronger transcriptional induction of
a target gene which is regulable by the tetracycline system or other inducible DNA
binding domain. For example, a TetR can be linked to a multiplicity of transcription
activation units, such that high levels of transcription occur in the absence of tetracycline
or analog thereof and that transcription is repressed in the presence of tetracycline.

In another embodiment, a "reverse" Tet system is used, again based on a DNA
binding domain that is a mutant of the E. coli TetR, but which binds to TetO in the
presence of Tet. Thus, the invention described herein provides a method for obtaining even
stronger transcriptional induction of a target gene in the presence of tetracycline or an
analog thereof from a very low background in the absence of tetracycline.

C. Oligomerization domains

As set out above, in various embodiments of the composite activators, the fusion 
proteins can also include at least one oligomerization domain. Such a domain can be a 
constitutive oligomerization domain, or an inducible oligomerization domain, i.e., a 
domain mediating oligomerization only in the presence of a third molecule, such as a 
small organic molecule. Examples of constitutive oligomerization domains include 
leucine zippers. 

Examples of inducible oligomerization domains include FK506 and cyclosporin
binding domains of FK506 binding proteins and cyclophilins, and the rapamycin binding
domain of FRAP (tori). Such inducible oligomerization domains are referred to herein as 
"ligand binding domains" and are further described herein under the section entitled 
accordingly. 

In one embodiment of the invention, at least one activation tag or ligand binding 
domain or DNA binding domain is linked to a constitutive oligomerization domain, e.g., a
dimerization or tetramerization domain. A dimerization domain is defined herein as a
sequence of amino acids capable of forming homodimers or heterodimers. One example
of a dimerization domain is the leucine zipper (LZ) element. Leucine zippers have been
identified, generally, as stretches of about 35 amino acids containing 4-5 leucine residues
separated from each other by six amino acids (Maniatis and Abel (1989) Nature 341:24-25).
Exemplary leucine zippers occur in a variety of eukaryotic DNA binding proteins,
such as GCN4, C/EBP, c-Fos, c-Jun, c-Myc and c-Max. Other dimerization domains
include helix-loop-helix domains (Murre, C. et al. (1989) Cell 58:537-544). Dimerization
domains may also be selected from other proteins, such as the retinoic acid receptor, the
thyroid hormone receptor or other nuclear hormone receptors (Kurokawa et al. (1993)
Genes Dev. 7:1423-1435) or from the yeast transcription factors GAL4 and HAP1
(Marmorstein et al. (1992) Nature 356:408-414; Zhang et al. (1993) Proc. Natl. Acad. Sci.
USA 90:2851-2855). Dimerization domains are further described in U.S. Patent No.
5,624,818 by Eisenman.
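
The leucine zipper pattern described above, leucines separated from each other by six residues and therefore recurring at every seventh position, can be searched for with a short scan. The Python sketch below reports each position at which a run of at least four such leucines begins. It is only a first-pass filter under that stated pattern; the function name and the synthetic test sequence are assumptions, and real zippers tolerate substitutions that this scan would miss.

```python
def leucine_heptad_starts(sequence, min_leucines=4):
    """Return 0-based start positions of leucine heptad runs.

    A run starts at position i when residues i, i+7, i+14, ... are all
    leucine (L) for at least min_leucines consecutive heptad positions.
    """
    span = 7 * (min_leucines - 1)  # distance from first to last leucine
    starts = []
    for i in range(len(sequence) - span):
        if all(sequence[i + 7 * k] == "L" for k in range(min_leucines)):
            starts.append(i)
    return starts


if __name__ == "__main__":
    # Synthetic sequence built for illustration, not a natural protein.
    synthetic = "MK" + "LAAAAAA" * 4 + "GS"
    print(leucine_heptad_starts(synthetic))
```
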

In another embodiment, the oligomerization domain is a tetramerization domain.
For example, four activation units can be linked to a single DNA binding domain or a
ligand binding domain by covalently linking the activation units to a tetramerization
domain. In a preferred embodiment, the tetramerization domain is the E. coli lactose
repressor tetramerization domain (amino acids 46-360; Chakerian et al. (1991) J. Biol.
Chem. 266:1371; Alberti et al. (1993) EMBO J. 12:3227; and Lewis et al. (1996) Nature
271:1247) as described in the Examples. Thus, the inclusion of a tetramerization domain
in a transcriptional activator allows four activation domains to be complexed together and
form a transcriptional activator complex. Furthermore, more than one activation unit can
be linked to one tetramerization domain, to thereby form a transcriptional activator
complex comprising more than 4 activation units.

In another embodiment, the tetramerization domain is that from a p53 protein. The
p53 tetramerization domain maps to residues 322-355 of p53 (Wang et al. (1994) Mol.
Cell. Biol. 14:5182; Clore et al. (1994) Science 265:386) and is further described in U.S.
Pat. No. 5,573,925 by Halazonetis.

The invention also provides for transcriptional activators containing at least one
modified oligomerization domain. Modifications in the oligomerization domain may
increase the stability of tetramer formation; for example, substitutions that stabilize
oligomerization driven by leucine zippers are known (Krylov et al. (1994) cited above;
O'Shea et al. (1992) cited above). As an exemplary modification of this type, residues 174
or 175 of human p53 are substituted by glutamine or leucine, respectively, in a p53
chimeric protein of this invention.

In other embodiments, the oligomerization domain can be an altered p53
tetramerization domain which is incapable of forming hetero-tetramers with p53 proteins
that have a wild-type p53 tetramerization domain, such as wild-type p53 or tumor-derived
p53 mutants. Such altered p53 tetramerization domains are further described in U.S. Pat.
No. 5,573,925 by Halazonetis.

These altered p53 tetramerization domains are characterized by disruption of the 
native p53 tetramerization domain and insertion of a heterologous oligomerization 
domain in a way that preserves tetramerization. According to this invention, a disruption
of the p53 tetramerization domain, involving residues 335-348 or a subset of these
residues, sufficiently disrupts the function of this domain so that it can no longer drive
tetramerization with wild-type p53 or tumor-derived p53 mutants. At the same time,
however, introduction of a heterologous dimerization domain reestablishes the ability to
form tetramers, which is mediated both by the heterologous dimerization domain and by 
the residual tetramerization domain of p53. 

Other exemplary suitable tetramerization domains include artificial
tetramerization domains, such as variants of the GCN4 leucine zipper that form tetramers
(Alberti et al. (1993) EMBO J. 12:3227-3236; Harbury et al. (1993) Science 262:1401-1407;
Krylov et al. (1994) EMBO J. 13:2849-2861). One of skill in the art could
readily select alternate tetramerization domains. For example, the tetrameric variant of
GCN4 leucine zipper described in Harbury et al. (1993), supra, has isoleucines at positions
d of the coiled coil and leucines at positions a, in contrast to the original zipper which has
leucines and valines, respectively.

The GCN4 leucine zipper drives parallel subunit assembly [Harbury et al. (1993), 
cited above], while the native p53 tetramerization domain drives antiparallel assembly 
[Clore et al. (1994) cited above; Sakamoto et al. (1994) Proc. Natl. Acad. Sci. USA
91:8974-8978]. Thus, various conformations of activation unit complexes can be obtained
by choosing various tetramerization domains. 

In addition, the art also provides a variety of techniques for identifying other 
naturally occurring oligomerization domains, as well as oligomerization domains derived 
from mutant or otherwise artificial sequences. See, for example, Zeng et al. (1997) Gene 
185:245; O'Shea et al. (1992) Cell 68:699-708; Krylov et al. [cited above].

The distance between the oligomerization domain and other components of the
fusion proteins can be varied. In one embodiment, there is no linker between an activation
unit and a tetramerization domain, e.g. an altered GCN4 leucine zipper. In other
embodiments, however, there are glutamic acid or asparagine or isoleucine linkers,
respectively. Linkers may be present for cloning convenience or to confer some useful
property. For example, residues that stabilize specific secondary structure elements, such
as alpha-helices, are known (Richardson et al. (1988) Science 240:1648-1652). Such
residues can be introduced in the linkers to stabilize the oligomerization domains. For
example, the linkers glycine-asparagine, arginine-glycine-asparagine,
arginine-glycine-glycine-asparagine-proline-glutamic acid, and
glycine-glycine-asparagine-glutamine-alanine
are all designed to stabilize the N-terminus of the alpha-helical oligomerization domain.

In one embodiment, the chimeric protein comprises an activation unit fused to an
asparagine linker and then to a tetrameric variant of GCN4 residues 249-281.
Alternatively, the linker can be an arginine-glycine-asparagine linker, an
arginine-glycine-glycine-asparagine-proline-glutamic acid linker, or a
glycine-glycine-asparagine-glutamine-alanine linker.

A variety of other amino acid or peptide linkers may be used for the reasons
discussed above, provided they do not interfere with the function of the activation units
and ligand binding domain or DNA binding domain.

D. Ligand binding domain

In another embodiment of the invention, components of the subject system include 
one or more ligand binding domains for mediating oligomerization of fusion proteins in a 
ligand-dependent fashion. In a preferred embodiment, the ligand is capable of interacting 
with two ligand binding domains. In an exemplary embodiment, formation of 
transcriptional complexes is regulated by addition of a ligand, and comprises introducing 
into the cell a fusion protein including a composite activator and a ligand binding domain, 
as well as a second fusion protein including a DNA binding domain and a ligand binding 
domain, such that in the presence of the ligand, a transcriptional activator complex is
formed between the two fusion proteins. Preferred ligands include macrolides such as
rapamycin, cyclosporin A, FK506, FK1012, and analogs thereof, and other synthetic
dimerizers or oligomerizers. Ligand binding domains include the FK506 binding domain
of FKBP, the cyclosporin-binding domain of calcineurin, and the rapamycin-binding
domain of FRAP. These binding domains and ligands are further disclosed, e.g., in
PCT/US93/01617. Such fusion proteins permit control of the expression of a target gene to
be dependent on addition of an appropriate ligand, e.g., one which is capable of interacting
simultaneously with the two ligand binding domains. 

In general, the ligand binding domain of a chimeric protein of this invention can be 
any convenient domain which will allow for ligand-dependent oligomerization of fusion
proteins using a natural or unnatural ligand, preferably an unnatural synthetic ligand. Of
particular interest are binding proteins for which ligands (preferably small organic ligands)
are known or may be readily produced. These receptors or ligand binding domains include
the FKBPs and cyclophilin receptors, the steroid receptors, the tetracycline receptor, the
other receptors indicated above, and the like, as well as "unnatural" receptors, which can 
be obtained from antibodies, particularly the heavy or light chain subunit, mutated 
sequences thereof, random amino acid sequences obtained by stochastic procedures, 
combinatorial syntheses, and the like. 

For the most part, the receptor domains will be at least about 50 amino acids, and
fewer than about 350 amino acids, usually fewer than 200 amino acids, either as the
natural domain or truncated active portion thereof. Preferably the binding domain will be
small (<25 kDa, to allow efficient transfection in viral vectors), monomeric (this rules out
the avidin-biotin system), nonimmunogenic, and should have synthetically accessible,
cell-permeable, nontoxic ligands that can be configured for dimerization.

Multimerizing ligands useful in practicing this invention are multivalent, i.e., 
capable of binding to, and thus multimerizing, two or more chimeric protein molecules
having a ligand binding domain. The multimerizing ligand may bind to the proteins
containing such ligand-binding domains, in either order or simultaneously, preferably with
a Kd value below about 10^-6, more preferably below about 10^-7, even more preferably
below about 10^-8, and in some embodiments below about 10^-9 M. The ligand preferably is
not a protein or polypeptide and has a molecular weight of less than about 5 kDa,
preferably below 2 kDa. The ligand-binding domains of the chimeric proteins so
multimerized may be the same or different. Ligand binding domains include, among
others, various immunophilin domains. One example is the FKBP domain which is
capable of binding to dimerizing ligands incorporating FK506 moieties or other
FKBP-binding moieties. See e.g. PCT/US93/01617, the full contents of which are hereby
incorporated by reference. 

The portion of the construct encoding the ligand binding domain can be subjected
to mutagenesis for a variety of reasons. The mutagenized domain can provide for higher
binding affinity, allow for discrimination by a ligand between the mutant and naturally
occurring forms of the ligand binding domain, provide opportunities to design
ligand-ligand binding domain pairs, or the like. The change in the ligand binding domain can
involve changes in amino acids known to be at the binding site, or random mutagenesis using
combinatorial techniques, where the codons for the amino acids associated with the
binding site or other amino acids associated with conformational changes can be subject to
mutagenesis by changing the codon(s) for the particular amino acid, either with known
changes or randomly, expressing the resulting proteins in an appropriate prokaryotic host
and then screening the resulting proteins for binding. Illustrative of this situation is to
modify FKBP12's Phe36 to Ala and/or Asp37 to Gly or Ala to accommodate a substituent
at positions 9 or 10 of FK506 or FK520. In particular, mutant FKBP12 moieties which
contain Val, Ala, Gly, Met or other small amino acids in place of one or more of Tyr26,
Phe36, Asp37, Tyr82 and Phe99 are of particular interest as receptor domains for
FK506-type and FK520-type ligands containing modifications at C9 and/or C10.

Illustrative examples of rapamycin-binding domains are those which include an 
approximately 89-amino acid rapamycin-binding domain from FRAP, e.g., containing
residues 2025-2113 of human FRAP. Similar considerations apply to the generation of
mutant FRAP-derived domains which bind preferentially to rapamycin analogs (rapalogs)
containing modifications (i.e., are 'bumped') relative to rapamycin in the FRAP-binding
effector domain. For example, one may obtain preferential binding using rapalogs bearing
substituents other than -OMe at the C7 position with FRBs based on the human FRAP
FRB peptide sequence but bearing amino acid substitutions for one or more of the residues
Tyr2038, Phe2039, Thr2098, Gln2099, Trp2101 and Asp2102. Exemplary mutations
include Y2038H, Y2038L, Y2038V, Y2038A, F2039H, F2039L, F2039A, F2039V,
D2102A, T2098A, T2098N, and T2098S. Rapalogs bearing substituents other than -OH at
C28 and/or substituents other than =O at C30 may be used to obtain preferential binding to
FRAP proteins bearing an amino acid substitution for Glu2032. Exemplary mutations
include E2032A and E2032S. Proteins comprising an FRB containing one or more amino
acid replacements at the foregoing positions, libraries of proteins or peptides randomized
at those positions (i.e., containing various substituted amino acids at those residues),
libraries randomizing the entire protein domain, or combinations of these sets of mutants
are made using the procedures described above to identify mutant FRAPs that bind 
preferentially to bumped rapalogs. 
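
The one-letter substitution labels used above (wild-type residue, position, replacement residue) lend themselves to simple bookkeeping. The following Python sketch, offered as an illustration only, parses such labels and checks that each position falls within residues 2025-2113 of human FRAP. The names `PointMutation`, `parse_mutation` and `within_frb` are hypothetical helpers introduced for this example.

```python
import re
from typing import NamedTuple


class PointMutation(NamedTuple):
    wild_type: str    # one-letter code of the original residue
    position: int     # residue number in full-length human FRAP
    replacement: str  # one-letter code of the substituted residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# Boundaries of the rapamycin-binding domain as stated in the text.
FRB_START, FRB_END = 2025, 2113


def parse_mutation(label: str) -> PointMutation:
    """Split a label such as 'Y2038H' into its three parts."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a point-mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return PointMutation(wild_type, int(position), replacement)


def within_frb(mutation: PointMutation) -> bool:
    """True if the mutated position lies inside residues 2025-2113."""
    return FRB_START <= mutation.position <= FRB_END


# Exemplary substitutions listed in the text.
EXEMPLARY = [
    "Y2038H", "Y2038L", "Y2038V", "Y2038A",
    "F2039H", "F2039L", "F2039A", "F2039V",
    "D2102A", "T2098A", "T2098N", "T2098S",
    "E2032A", "E2032S",
]

if __name__ == "__main__":
    for label in EXEMPLARY:
        m = parse_mutation(label)
        print(label, m.position - FRB_START + 1, within_frb(m))
```

The second printed column is the position counted from the first residue of the domain, which may be convenient when working with an isolated FRB construct.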

Other macrolide binding domains useful in the present invention, including 
mutants thereof, are described in the art. See, for example, WO96/41865, WO96/13613,
WO96/06111, WO96/06110, WO96/06097, WO96/12796, WO95/05389, WO95/02684,
WO94/18317, each of which is expressly incorporated by reference herein.

The ability to employ in vitro mutagenesis or combinatorial modifications of
sequences encoding proteins allows for the production of libraries of proteins which can be
screened for binding affinity for different ligands. For example, one can totally randomize
a sequence of 1 to 5, 10 or more codons, at one or more sites in a DNA sequence encoding
a binding protein, make an expression construct and introduce the expression construct
into a unicellular microorganism, and develop a library. One can then screen the library
for binding affinity to one or desirably a plurality of ligands. The best affinity sequences
which are compatible with the cells into which they would be introduced can then be used
as the ligand binding domain. The ligand would be screened with the host cells to be used
to determine the level of binding of the ligand to endogenous proteins. A binding profile
could be defined weighting the ratio of binding affinity to the mutagenized binding
domain with the binding affinity to endogenous proteins. Those ligands which have the
best binding profile could then be used as the ligand. Phage display techniques, as a
non-limiting example, can be used in carrying out the foregoing.
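
The library size and binding profile described in the preceding paragraph can be sketched numerically. The following Python example is a minimal illustration under stated assumptions: full randomization is taken to allow any of the 20 standard amino acids at each position, affinity is expressed as Kd, and the binding profile is taken as the simple ratio Kd(endogenous)/Kd(mutagenized domain). The ligand names and Kd values are hypothetical.

```python
def protein_diversity(n_codons: int) -> int:
    """Number of distinct peptide sequences when n_codons positions are
    fully randomized (20 standard amino acids per position)."""
    if n_codons < 0:
        raise ValueError("n_codons must be non-negative")
    return 20 ** n_codons


def binding_profile(kd_mutant_domain: float, kd_endogenous: float) -> float:
    """Selectivity ratio; larger values mean the ligand binds the
    mutagenized domain more tightly than endogenous proteins."""
    if kd_mutant_domain <= 0 or kd_endogenous <= 0:
        raise ValueError("Kd values must be positive")
    return kd_endogenous / kd_mutant_domain


# Hypothetical measurements: (Kd for mutagenized domain, Kd for endogenous
# proteins), both in mol/L.
candidates = {
    "ligand_A": (1e-9, 1e-6),
    "ligand_B": (1e-8, 1e-4),
    "ligand_C": (1e-9, 1e-8),
}

# Rank ligands from best to worst binding profile.
ranked = sorted(
    candidates,
    key=lambda name: binding_profile(*candidates[name]),
    reverse=True,
)

if __name__ == "__main__":
    for n in (1, 5, 10):
        print(f"{n} randomized codons -> {protein_diversity(n):,} sequences")
    print("ranking:", ranked)
```

Note that in this made-up data set the ligand with the tightest binding to the mutagenized domain is not necessarily ranked first, because binding to endogenous proteins also enters the ratio.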

In other embodiments, antibody subunits, e.g. heavy or light chain, particularly
fragments, more particularly all or part of the variable region, or fusions of heavy and light
chain to create single chain antibodies, can be used as the ligand binding domain.
Antibodies can be prepared against haptenic molecules which are physiologically
acceptable and the individual antibody subunits screened for binding affinity. The cDNA
encoding the subunits can be isolated and modified by deletion of the constant region,
portions of the variable region, mutagenesis of the variable region, or the like, to obtain a
binding protein domain that has the appropriate affinity for the ligand. In this way, almost
any physiologically acceptable haptenic compound can be employed as the ligand or to
provide an epitope for the ligand. Instead of antibody units, natural receptors can be
employed, where the binding domain is known and there is a useful ligand for binding.

In yet another embodiment of the invention, the DNA binding unit is linked to
more than one ligand binding domain. For example, a DNA binding domain can be linked
to at least 2, 3, 4, or 5 ligand binding domains. A DNA binding domain can also be linked
to at least 5 ligand binding domains or any number of ligand binding domains. In such
embodiments, the ligand binding domains can be, by way of illustration, linked to each other in a
linear array, by linking the NH2-terminus of one ligand binding domain to the
COOH-terminus of another ligand binding domain, e.g., as shown in Figure 3. Thus, numerous
composite activators can be linked to a single DNA binding domain in the presence of a
ligand.

The invention further provides additional induction systems. In one embodiment,
the invention uses an alternative allosteric on-switch for transcription which employs a
deletion mutant of the human progesterone receptor, i.e., one which no longer binds
progesterone or any known endogenous steroid but can be activated by the orally active
progesterone antagonist RU486, described, e.g., in Wang et al. (1994) Proc. Natl. Acad.
Sci. USA 91:8180. Activation was demonstrated, e.g., in cells transplanted into mice
using doses of RU486 (5-50 µg/kg) considerably below the usual dose for inducing
abortion in humans (10 mg/kg). However, according to the art describing this system, the
induction ratio in culture and in animals was rather low. Applying the invention
described herein in this system would provide an inducible system having a higher
induction ratio. Thus, the invention provides a transcriptional activator comprising at least
one subunit which is covalently linked to a mutant steroid binding domain to yield a
transcriptional activator which transactivates in an RU486-dependent manner, resulting in
high induction ratios.
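
To make the dose comparison in the preceding paragraph concrete, the following Python sketch converts both figures to the same unit and computes how many fold the 5-50 microgram/kg range lies below the 10 mg/kg figure. The variable and function names are hypothetical; only the three dose values come from the text.

```python
MICROGRAMS_PER_MILLIGRAM = 1000

# Dose quoted in the text for inducing abortion in humans: 10 mg/kg.
REFERENCE_DOSE_UG_PER_KG = 10 * MICROGRAMS_PER_MILLIGRAM


def fold_below_reference(dose_ug_per_kg: float) -> float:
    """How many fold a dose (in micrograms/kg) lies below 10 mg/kg."""
    if dose_ug_per_kg <= 0:
        raise ValueError("dose must be positive")
    return REFERENCE_DOSE_UG_PER_KG / dose_ug_per_kg


if __name__ == "__main__":
    # Lower and upper ends of the activating dose range quoted in the text.
    for dose in (5, 50):
        fold = fold_below_reference(dose)
        print(f"{dose} ug/kg is {fold:.0f}-fold below 10 mg/kg")
```

On these figures the activating doses are roughly 200-fold to 2000-fold below the comparison dose.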

The invention can be adapted to an ecdysone inducible system. Early work
demonstrated that fusing the Drosophila steroid ecdysone (Ec) receptor (EcR) Ec-binding
domain to heterologous DNA binding and activation domains, such as E. coli LexA and
herpesvirus VP16, permits ecdysone-dependent activation of target genes downstream of
appropriate binding sites (Christopherson et al. (1992) Proc. Natl. Acad. Sci. U.S.A.
89:6314). An improved ecdysone regulation system has been developed, using the DNA
binding domain of the EcR itself. In this system, the regulating transcription factor is
provided as two proteins: (1) a truncated, mutant EcR fused to herpes VP16 and (2) the
mammalian homolog (RXR) of Ultraspiracle protein (USP), which heterodimerizes with
the EcR (No et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:3346). In this system, because
the DNA binding domain was also recognized by a human receptor (the human farnesoid
X receptor), it was altered to a site recognized only by the mutant EcR. Thus, the
invention provides an ecdysone inducible system, in which a truncated mutant EcR is
fused to at least one subunit of a transcriptional activator of the invention. The
transcriptional activator further comprises USP, thereby providing high level induction of
transcription of a target gene having the EcR target sequence, dependent on the presence
of ecdysone.

The invention can also be applied to any other inducible system, thereby providing 
an inducible system having a higher ratio of induction to background due to the potent
transcriptional activity of the transcriptional activators of the invention.

E. Additional domains and linkers

Additional domains may be included in the transcriptional activators of this 
invention. For example, the transcriptional activators may contain a nuclear localization 
sequence which provides for the protein to be translocated to the nucleus. Accordingly, in 
one embodiment of the invention, at least one of the subunits of the transcriptional 
activator of the invention, e.g., activation unit, DNA binding domain, or ligand binding
domain, further comprises a nuclear localization signal (NLS). An NLS can be fused to the
N-terminus or the C-terminus of a subunit, e.g., an activation unit, or can be inserted at
the junction of one subunit with another subunit, e.g., between an activation domain and a
DNA binding or ligand binding domain or oligomerization domain, or elsewhere in the
protein, as long as the function of the subunits is not disrupted by insertion of the NLS.
Typically a nuclear localization sequence has a plurality of basic amino acids, referred to
as a bipartite basic repeat (reviewed in Garcia-Bustos et al. (1991) Biochimica et
Biophysica Acta 1071:83-101). The NLS may be that of SV40 large T antigen, which is
comprised of amino acids proline-lysine-lysine-lysine-arginine-lysine-valine (Kalderon et
al. (1984) Cell 39:499-509). The NLS may also be from a p53 protein. Wild-type p53
contains three nuclear localization signals (NLS), all of which map to the C-terminus of
wild-type p53 and specifically to residues 316-325, 369-375 and 379-384 of p53
(Shaulsky et al. (1990) Mol. Cell. Biol. 10:6565-6577). Additional heterologous NLS are
described by Shaulsky et al. (1990) supra and Shaulsky et al. (1991) Oncogene 6:2056.
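
The three NLS placements described above (N-terminal fusion, C-terminal fusion, or insertion at a junction between subunits) can be sketched at the sequence level. The following Python example is illustrative only: the SV40 large T antigen NLS is written in one-letter code (PKKKRKV) from the residues named in the text, while the function names and the short placeholder "subunit" strings are hypothetical and do not correspond to any sequence in this specification.

```python
# SV40 large T antigen NLS: Pro-Lys-Lys-Lys-Arg-Lys-Val in one-letter code.
SV40_NLS = "PKKKRKV"


def add_terminal_nls(subunit: str, terminus: str, nls: str = SV40_NLS) -> str:
    """Fuse an NLS to the N-terminus ('N') or C-terminus ('C') of a subunit."""
    if terminus == "N":
        return nls + subunit
    if terminus == "C":
        return subunit + nls
    raise ValueError("terminus must be 'N' or 'C'")


def insert_nls_at_junction(first: str, second: str, nls: str = SV40_NLS) -> str:
    """Place an NLS at the junction between two adjacent subunits."""
    return first + nls + second


if __name__ == "__main__":
    # Placeholder sequences standing in for real domains (hypothetical).
    activation_unit = "AAAA"
    dna_binding_domain = "DDDD"
    print(add_terminal_nls(activation_unit, "N"))
    print(add_terminal_nls(activation_unit, "C"))
    print(insert_nls_at_junction(activation_unit, dna_binding_domain))
```

Whether a given placement preserves the function of the neighboring subunits has to be determined experimentally; the sketch only shows the sequence bookkeeping.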

The chimeric proteins may include domains that facilitate their purification, e.g.
"histidine tags" or a glutathione-S-transferase domain. They may include "epitope tags"
encoding peptides recognized by known monoclonal antibodies for the detection of
proteins within cells or the capture of proteins by antibodies in vitro.

It may be necessary in some instances to introduce an unstructured polypeptide
linker region between an activation tag or tags and other portions of the chimeric activator.
Where the fusion protein also includes, for example, oligomerization sequences, it may be
preferable to situate the linker between the oligomerization sequences and the activation
tags. The linker can facilitate enhanced flexibility of the fusion protein, while the
oligomerization sequences are relatively free to make other inter-protein contacts, e.g.,
with other chimeric activators. The linker can also reduce steric hindrance between any
two fragments of the fusion protein. The linker can also facilitate the appropriate folding
of each fragment. The linker can be of natural origin, such as a sequence
determined to exist in random coil between two domains of a protein. An exemplary
linker sequence is the linker found between the C-terminal and N-terminal domains of the
RNA polymerase α subunit. Other examples of naturally occurring linkers include linkers
found in the λcI and LexA proteins. Alternatively, the linker can be of synthetic origin.
For instance, the sequence (Gly4Ser)3 can be used as a synthetic unstructured linker.
Linkers of this type are described in Huston et al. (1988) PNAS 85:4879; and U.S. Patent
No. 5,091,513, both incorporated by reference herein.
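
The synthetic linker (Gly4Ser)3 named above expands to a 15-residue sequence. The following Python sketch, given purely as an illustration, builds that sequence in one-letter code and joins domains with it. The function names and the placeholder domain strings are hypothetical.

```python
from typing import Sequence


def gly_ser_linker(repeats: int = 3, glycines_per_repeat: int = 4) -> str:
    """Build a (Gly_n Ser)_m linker in one-letter code.

    The defaults give (Gly4Ser)3, i.e. three copies of GGGGS.
    """
    if repeats < 1 or glycines_per_repeat < 1:
        raise ValueError("repeats and glycines_per_repeat must be >= 1")
    return ("G" * glycines_per_repeat + "S") * repeats


def join_with_linker(domains: Sequence[str], linker: str) -> str:
    """Concatenate domains N- to C-terminus with the linker between each pair."""
    return linker.join(domains)


if __name__ == "__main__":
    linker = gly_ser_linker()
    print(linker, len(linker))
    # Placeholder domain sequences (hypothetical).
    print(join_with_linker(["OLIGO", "TAG"], linker))
```

Varying `repeats` is one simple way to explore linkers of different lengths when the required span between domains is not known in advance.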

In some embodiments it is preferable that the design of a linker involve an
arrangement of domains which requires the linker to span a relatively short distance,
preferably less than about 10 Å. However, in certain embodiments, depending, e.g., upon
the selected DNA-binding domains and the configuration, the linker may span a distance
of up to about 50 Å.

Within the linker, the amino acid sequence may be varied based on the preferred
characteristics of the linker as determined empirically or as revealed by modeling. For
instance, in addition to a desired length, modeling studies may show that side groups of
certain amino acids may interfere with the biological activity, e.g. DNA binding or
transcriptional activation, of the protein. Considerations in choosing a linker include
flexibility of the linker, charge of the linker, and presence of some amino acids of the
linker in the naturally-occurring subunits. The linker can also be designed such that
residues in the linker contact DNA, thereby influencing binding affinity or specificity, or
interact with other proteins. For example, a linker may contain an amino acid sequence
which can be recognized by a protease so that the activity of the chimeric protein could be
regulated by cleavage. In some cases, particularly when it is necessary to span a longer
distance between subunits or when the domains must be held in a particular configuration,
the linker may optionally contain an additional folded domain.

Most of the subject fusion proteins can be tested for activity in vivo using a simple
assay (F. M. Ausubel et al., Eds., Current Protocols in Molecular Biology, John Wiley &
Sons, New York, 1994; de Wet et al. (1987) Mol. Cell Biol. 7:725). The in vivo assay
requires an expression construct containing and capable of directing the expression of a
recombinant DNA sequence encoding the composite activator, and as appropriate, other
proteins required for DNA localization of the activator. The assay also requires a plasmid
containing a reporter gene, e.g., the luciferase gene, the chloramphenicol acetyl
transferase (CAT) gene, secreted alkaline phosphatase or the human growth hormone
(hGH) gene, linked to a binding site for the transcription factor. The expression constructs
are introduced into host cells which normally do not produce interfering levels of the
reporter gene product. A second group of cells, which lacks the composite activator or the
means for localizing the activator to the reporter gene, can serve as the control.

The production of mRNA or protein encoded by the reporter gene is measured. An
increase in reporter gene expression not seen in the controls indicates that the transcription
factor is a positive regulator of transcription. If reporter gene expression is less than that of
the control, the transcription factor is a negative regulator of transcription.

Optionally, the assay may include a transfection efficiency control plasmid. This
plasmid expresses a gene product independent of the test gene, and the amount of this gene
product indicates roughly how many cells are taking up the plasmids and how efficiently
the DNA is being introduced into the cells. Additional guidance on evaluating chimeric
proteins of this invention is provided below.
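
The reporter assay logic described above can be summarized in a short calculation. The following Python sketch is one possible way, assumed here for illustration, to normalize each reporter reading by the signal from the transfection efficiency control plasmid and then compare test cells with control cells. The function names, the simple greater-than/less-than classification and the numerical readings are hypothetical; a real analysis would also consider replicates and measurement noise.

```python
def normalized_signal(reporter: float, efficiency_control: float) -> float:
    """Reporter reading divided by the transfection-control reading."""
    if efficiency_control <= 0:
        raise ValueError("efficiency control signal must be positive")
    return reporter / efficiency_control


def fold_change(test_reporter: float, test_control_plasmid: float,
                ctrl_reporter: float, ctrl_control_plasmid: float) -> float:
    """Normalized reporter signal in test cells relative to control cells
    (cells lacking the composite activator)."""
    test = normalized_signal(test_reporter, test_control_plasmid)
    ctrl = normalized_signal(ctrl_reporter, ctrl_control_plasmid)
    if ctrl <= 0:
        raise ValueError("control cells must give a positive normalized signal")
    return test / ctrl


def classify(fold: float) -> str:
    """Label the factor by the direction of the change in reporter expression."""
    if fold > 1:
        return "positive regulator"
    if fold < 1:
        return "negative regulator"
    return "no detectable effect"


if __name__ == "__main__":
    # Hypothetical raw readings in arbitrary units.
    fold = fold_change(test_reporter=5000, test_control_plasmid=100,
                       ctrl_reporter=250, ctrl_control_plasmid=50)
    print(fold, classify(fold))
```

Dividing by the control-plasmid signal keeps differences in DNA uptake between wells from being mistaken for differences in transcriptional activation.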

III. Nucleic Acid Compositions

In another aspect of the invention, the proteins described herein are provided in 
expression vectors. For instance, expression vectors are contemplated which include a 
nucleotide sequence encoding a polypeptide containing a composite activator of the 
present invention, which coding sequence is operably linked to at least one transcriptional 
regulatory sequence. Regulatory sequences for directing expression of the instant fusion 
proteins are art-recognized and are selected by a number of well understood criteria.
Exemplary regulatory sequences are described in Goeddel, Gene Expression Technology:
Methods in Enzymology, Academic Press, San Diego, CA (1990). For instance, any of a
wide variety of expression control sequences that control the expression of a DNA
sequence when operatively linked to it may be used in these vectors to express DNA
sequences encoding the fusion proteins of this invention. Such useful expression control
sequences include, for example, the early and late promoters of SV40, adenovirus or
cytomegalovirus immediate early promoter, the lac system, the trp system, the TAC or
TRC system, T7 promoter whose expression is directed by T7 RNA polymerase, the
promoter for 3-phosphoglycerate kinase or other glycolytic enzymes, the promoters of acid
phosphatase, e.g., Pho5, and the promoters of the yeast α-mating factors and other
sequences known to control the expression of genes of prokaryotic or eukaryotic cells or
their viruses, and various combinations thereof. It should be understood that the design of
the expression vector may depend on such factors as the choice of the host cell to be
transformed. Moreover, the vector's copy number, the ability to control that copy number
and the expression of any other protein encoded by the vector, such as antibiotic markers,
should also be considered.

As will be apparent, the subject gene constructs can be used to cause expression of
the subject fusion proteins in cells propagated in culture, e.g. to produce proteins or
polypeptides, including fusion proteins, for purification.

This invention also pertains to a host cell transfected with a recombinant gene in 
order to express one of the subject polypeptides. The host cell may be any prokaryotic or
eukaryotic cell. For example, a fusion protein of the present invention may be expressed
in bacterial cells such as E. coli, insect cells (baculovirus), yeast, or mammalian cells.
Other suitable host cells are known to those skilled in the art. 

Accordingly, the present invention further pertains to methods of producing the
subject fusion proteins. For example, a host cell transfected with an expression vector
encoding a protein of interest can be cultured under appropriate conditions to allow
expression of the protein to occur. The protein may be secreted, by inclusion of a
secretion signal sequence, and isolated from a mixture of cells and medium containing the
protein. Alternatively, the protein may be retained cytoplasmically and the cells harvested,
lysed and the protein isolated. A cell culture includes host cells, media and other
byproducts. Suitable media for cell culture are well known in the art. The proteins can be
isolated from cell culture medium, host cells, or both using techniques known in the art for
purifying proteins, including ion-exchange chromatography, gel filtration chromatography,
ultrafiltration, electrophoresis, and immunoaffinity purification with antibodies specific for
particular epitopes of the protein.

Thus, a coding sequence for a fusion protein of the present invention can be used to
produce a recombinant form of the protein via microbial or eukaryotic cellular processes.
Ligating the polynucleotide sequence into a gene construct, such as an expression vector,
and transforming or transfecting into hosts, either eukaryotic (yeast, avian, insect or
mammalian) or prokaryotic (bacterial cells), are standard procedures.

Expression vehicles for production of a recombinant protein include plasmids and
other vectors. For instance, suitable vectors for the expression of the instant fusion
proteins include plasmids of the types: pBR322-derived plasmids, pEMBL-derived
plasmids, pEX-derived plasmids, pBTac-derived plasmids and pUC-derived plasmids for
expression in prokaryotic cells, such as E. coli.

A number of vectors exist for the expression of recombinant proteins in yeast. For
instance, YEP24, YIP5, YEP51, YEP52, pYES2, and YRP17 are cloning and expression
vehicles useful in the introduction of genetic constructs into S. cerevisiae (see, for
example, Broach et al., (1983) in Experimental Manipulation of Gene Expression, ed.
M. Inouye, Academic Press, p. 83, incorporated by reference herein). These vectors can
replicate in E. coli due to the presence of the pBR322 ori, and in S. cerevisiae due to the
replication determinant of the yeast 2 micron plasmid. In addition, drug resistance
markers such as ampicillin can be used.

The preferred mammalian expression vectors contain both prokaryotic sequences
to facilitate the propagation of the vector in bacteria, and one or more eukaryotic
transcription units that are expressed in eukaryotic cells. The pcDNAI/amp, pcDNAI/neo,
pRc/CMV, pSV2gpt, pSV2neo, pSV2-dhfr, pTk2, pRSVneo, pMSG, pSVT7, pko-neo and
pHyg derived vectors are examples of mammalian expression vectors suitable for
transfection of eukaryotic cells. Some of these vectors are modified with sequences from
bacterial plasmids, such as pBR322, to facilitate replication and drug resistance selection
in both prokaryotic and eukaryotic cells. Alternatively, derivatives of viruses such as the
bovine papilloma virus (BPV-1), or Epstein-Barr virus (pHEBo, pREP-derived and p205)
can be used for transient expression of proteins in eukaryotic cells. Examples of other
viral (including retroviral) expression systems can be found below in the description of
gene therapy delivery systems. The various methods employed in the preparation of the
plasmids and transformation of host organisms are well known in the art. For other
suitable expression systems for both prokaryotic and eukaryotic cells, as well as general
recombinant procedures, see Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by
Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press, 1989) Chapters 16
and 17. In some instances, it may be desirable to express the recombinant fusion proteins
by the use of a baculovirus expression system. Examples of such baculovirus expression
systems include pVL-derived vectors (such as pVL1392, pVL1393 and pVL941),
pAcUW-derived vectors (such as pAcUW1), and pBlueBac-derived vectors (such as the
β-gal containing pBlueBac III).

In yet other embodiments, the subject expression constructs are derived by 
insertion of the subject gene into viral vectors including recombinant retroviruses,
adenovirus, adeno-associated virus, and herpes simplex virus-1, or recombinant bacterial
or eukaryotic plasmids. As described in greater detail below, such embodiments of the
subject expression constructs are specifically contemplated for use in various in vivo and 
ex vivo gene therapy protocols. 

Retrovirus vectors and adeno-associated virus vectors are generally understood to
be the recombinant gene delivery system of choice for the transfer of exogenous genes in
vivo, particularly into humans. These vectors provide efficient delivery of genes into
cells, and the transferred nucleic acids are stably integrated into the chromosomal DNA of
the host. A major prerequisite for the use of retroviruses is to ensure the safety of their
use, particularly with regard to the possibility of the spread of wild-type virus in the cell
population. The development of specialized cell lines (termed "packaging cells") which
produce only replication-defective retroviruses has increased the utility of retroviruses for
gene therapy, and defective retroviruses are well characterized for use in gene transfer for
gene therapy purposes (for a review see Miller, A.D. (1990) Blood 76:271). Thus,
recombinant retrovirus can be constructed in which part of the retroviral coding sequence
(gag, pol, env) has been replaced by nucleic acid encoding a fusion protein of the present
invention, e.g., a composite activator, rendering the retrovirus replication defective. The
replication defective retrovirus is then packaged into virions which can be used to infect a
target cell through the use of a helper virus by standard techniques. Protocols for
producing recombinant retroviruses and for infecting cells in vitro or in vivo with such
viruses can be found in Current Protocols in Molecular Biology, Ausubel, F.M. et al.,
(eds.) Greene Publishing Associates, (1989), Sections 9.10-9.14 and other standard
laboratory manuals. Examples of suitable retroviruses include pLJ, pZIP, pWE and pEM,
which are well known to those skilled in the art. Examples of suitable packaging virus
lines for preparing both ecotropic and amphotropic retroviral systems include ψCrip,
ψCre, ψ2 and ψAm. Retroviruses have been used to introduce a variety of
genes into many different cell types, including neural cells, epithelial cells, endothelial
cells, lymphocytes, myoblasts, hepatocytes, bone marrow cells, in vitro and/or in vivo (see
for example Eglitis et al., (1985) Science 230:1395-1398; Danos and Mulligan, (1988)
PNAS USA 85:6460-6464; Wilson et al., (1988) PNAS USA 85:3014-3018; Armentano et
al., (1990) PNAS USA 87:6141-6145; Huber et al., (1991) PNAS USA 88:8039-8043;
Ferry et al., (1991) PNAS USA 88:8377-8381; Chowdhury et al., (1991) Science
254:1802-1805; van Beusechem et al., (1992) PNAS USA 89:7640-7644; Kay et al.,
(1992) Human Gene Therapy 3:641-647; Dai et al., (1992) PNAS USA 89:10892-10895;
Hwu et al., (1993) J. Immunol. 150:4104-4115; U.S. Patent No. 4,868,116; U.S. Patent
No. 4,980,286; PCT Application WO 89/07136; PCT Application WO 89/02468; PCT
Application WO 89/05345; and PCT Application WO 92/07573).

Furthermore, it has been shown that it is possible to limit the infection spectrum of
retroviruses and consequently of retroviral-based vectors, by modifying the viral
packaging proteins on the surface of the viral particle (see, for example PCT publications
WO93/25234, WO94/06920, and WO94/11524). For instance, strategies for the
modification of the infection spectrum of retroviral vectors include: coupling antibodies
specific for cell surface antigens to the viral env protein (Roux et al., (1989) PNAS USA
86:9079-9083; Julan et al., (1992) J. Gen Virol 73:3251-3255; and Goud et al., (1983)

defective virus that requires another virus, such as an adenovirus or a herpes virus, as a
helper virus for efficient replication and a productive life cycle. (For a review, see
Muzyczka et al., Curr. Topics in Micro. and Immunol. (1992) 158:97-129). It is also one
of the few viruses that may integrate its DNA into non-dividing cells, and exhibits a high
frequency of stable integration (see for example Flotte et al., (1992) Am. J. Respir. Cell.
Mol. Biol. 7:349-356; Samulski et al., (1989) J. Virol. 63:3822-3828; and McLaughlin et
al., (1989) J. Virol. 62:1963-1973). Vectors containing as little as 300 base pairs of AAV
can be packaged and can integrate. Space for exogenous DNA is limited to about 4.5 kb.
An AAV vector such as that described in Tratschin et al., (1985) Mol. Cell. Biol.
5:3251-3260 can be used to introduce DNA into cells. A variety of nucleic acids have been
introduced into different cell types using AAV vectors (see for example Hermonat et al.,
(1984) PNAS USA 81:6466-6470; Tratschin et al., (1985) Mol. Cell. Biol. 4:2072-2081;
Wondisford et al., (1988) Mol. Endocrinol. 2:32-39; Tratschin et al., (1984) J. Virol.
51:611-619; and Flotte et al., (1993) J. Biol. Chem. 268:3781-3790).
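
The stated limit of about 4.5 kb of exogenous DNA for an AAV vector suggests a simple design check before cloning. The following Python sketch sums the lengths of the elements of a proposed insert and compares the total with that limit. The function name, the element names and the lengths are hypothetical values chosen for the example; only the 4.5 kb figure comes from the text, and it is itself approximate.

```python
from typing import Mapping

# Approximate space for exogenous DNA in an AAV vector, per the text (4.5 kb).
AAV_CAPACITY_BP = 4500


def fits_in_aav(element_lengths_bp: Mapping[str, int],
                capacity_bp: int = AAV_CAPACITY_BP) -> bool:
    """True if the combined length of all elements is within the capacity."""
    if any(length < 0 for length in element_lengths_bp.values()):
        raise ValueError("element lengths must be non-negative")
    return sum(element_lengths_bp.values()) <= capacity_bp


if __name__ == "__main__":
    # Hypothetical cassette; lengths in base pairs are made up for illustration.
    cassette = {
        "promoter": 600,
        "composite_activator_coding_sequence": 2400,
        "polyadenylation_signal": 250,
    }
    total = sum(cassette.values())
    print(f"total = {total} bp, fits = {fits_in_aav(cassette)}")
```

Because the limit is given only as "about 4.5 kb", a design close to the threshold would warrant empirical confirmation of packaging.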

Other viral vector systems that may have application in gene therapy have been 
derived from herpes virus, vaccinia virus, and several RNA viruses. In particular, herpes 
virus vectors may provide a unique strategy for persistence of the recombinant gene in 
cells of the central nervous system and ocular tissue (Pepose et al., (1994) Invest
Ophthalmol Vis Sci 35:2662-2666).

In addition to viral transfer methods, such as those illustrated above, non-viral 
methods can also be employed to cause expression of a protein in the tissue of an animal. 
Most nonviral methods of gene transfer rely on normal mechanisms used by mammalian 
cells for the uptake and intracellular transport of macromolecules. In preferred 
embodiments, non-viral gene delivery systems of the present invention rely on endocytic 
pathways for the uptake of the gene by the targeted cell. Exemplary gene delivery systems 
of this type include liposomal derived systems, poly-lysine conjugates, and artificial viral 
envelopes. 

In a representative embodiment, a gene encoding a composite activator can be 
entrapped in liposomes bearing positive charges on their surface (e.g., lipofectins) and 
(optionally) which are tagged with antibodies against cell surface antigens of the target
tissue (Mizuno et al., (1992) No Shinkei Geka 20:547-551; PCT publication WO91/06309;
Japanese patent application 1047381; and European patent publication EP-A-43075). For
example, lipofection of neuroglioma cells can be carried out using liposomes tagged with
monoclonal antibodies against glioma-associated antigen (Mizuno et al., (1992) Neurol.
Med. Chir. 32:873-876).

In yet another illustrative embodiment, the gene delivery system comprises an 
antibody or cell surface ligand which is cross-linked with a gene binding agent such as 
poly-lysine (see, for example, PCT publications WO93/04701, WO92/22635,
WO92/20316, WO92/19749, and WO92/06180). For example, any of the subject gene
constructs can be used to transfect specific cells in vivo using a soluble polynucleotide
carrier comprising an antibody conjugated to a polycation, e.g. poly-lysine (see U.S. Patent
5,166,320). It will also be appreciated that effective delivery of the subject nucleic acid
constructs via -mediated endocytosis can be improved using agents which enhance escape
of the gene from the endosomal structures. For instance, whole adenovirus or fusogenic
peptides of the influenza HA gene product can be used as part of the delivery system to
induce efficient disruption of DNA-containing endosomes (Mulligan et al., (1993) Science
260:926; Wagner et al., (1992) PNAS USA 89:7934; and Christiano et al., (1993) PNAS
USA 90:2122).

In clinical settings, the gene delivery systems can be introduced into a patient by 
any of a number of methods, each of which is familiar in the art. 

For instance, a pharmaceutical preparation of the gene delivery system can be
introduced systemically, e.g. by intravenous injection, and specific transduction of the
construct in the target cells occurs predominantly from specificity of transfection provided
by the gene delivery vehicle, cell-type or tissue-type expression due to the transcriptional
regulatory sequences controlling expression of the gene, or a combination thereof. In
other embodiments, initial delivery of the recombinant gene is more limited with
introduction into the animal being quite localized. For example, the gene delivery vehicle
can be introduced by catheter (see U.S. Patent 5,328,470) or by stereotactic injection (e.g.
Chen et al., (1994) PNAS USA 91: 3054-3057).



IV. Target gene

As used herein, the term "target gene" refers to a gene whose transcription is
stimulated according to the method of the invention. In a preferred embodiment, the gene
is integrated in the chromosomal DNA of a cell. A cell comprising a target gene is
referred to herein as a "target cell".

In a preferred embodiment of the invention, the target gene is an endogenous gene.
As used herein, the term "endogenous gene" refers to a gene which is naturally present in a
cell, in its natural environment, i.e., not a gene which has been introduced into the cell by
genetic engineering. The endogenous gene can be any gene having a promoter that is
recognized by at least one transcription factor. In a preferred embodiment, the promoter or
any regulatory element thereof, of the endogenous gene ("endogenous promoter" and
"endogenous regulatory element", respectively), is recognized by a known, preferably
cloned, DNA binding protein, whether it is a transcriptional activator or repressor.
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sequence of interest which provides a desired phenotype. It can encode a surface 
membrane protein, a secreted protein, a cytoplasmic protein, or there can be a plurality of 
target genes encoding different products. The target gene may be an antisense sequence 
which can modulate a particular pathway by inhibiting a transcriptional regulation protein 
5 or turn on a particular pathway by inhibiting the translation of an inhibitor of the pathway. 
The target gene can encode a ribozyme which may modulate a particular pathway by 
interfering, at the RNA level, with the expression of a relevant transcriptional regulator or 
with the expression of an inhibitor of a particular pathway. The proteins which are 
expressed, singly or in combination, can involve homing, cytotoxicity, proliferation, 
10 immune response, inflammatory response, clotting or dissolving of clots, hormonal 
regulation, etc. The proteins expressed may be naturally-occurring proteins, mutants of 
naturally-occurring proteins, unique sequences, or combinations thereof. 

Various secreted products include hormones, such as insulin, human growth
hormone, glucagon, pituitary releasing factor, ACTH, melanotropin, relaxin, etc.; growth
factors, such as EGF, IGF-1, TGF-α, -β, PDGF, G-CSF, M-CSF, GM-CSF, FGF,
erythropoietin, thrombopoietin, megakaryocyte stimulating and growth factors, etc.;
interleukins, such as IL-1 to -13; TNF-α and -β, etc.; and enzymes and other factors, such
as tissue plasminogen activator, members of the complement cascade, perforins,
superoxide dismutase, coagulation factors, antithrombin-III, Factor VIIIc, Factor VIIIvW,
Factor IX, α-antitrypsin, protein C, protein S, endorphins, dynorphin, bone morphogenetic
protein, etc.

The gene can encode a naturally-occurring surface membrane protein or a protein
made so by introduction of an appropriate signal peptide and transmembrane sequence.
Various such proteins include homing receptors, e.g. L-selectin (Mel-14), blood-related
proteins, particularly having a kringle structure, e.g. Factor VIIIc, Factor VIIIvW,
hematopoietic cell markers, e.g. CD3, CD4, CD8, B-cell receptor, TCR subunits α, β, γ, δ,
CD10, CD19, CD28, CD33, CD38, CD41, etc., receptors, such as the interleukin receptors
IL-2R, IL-4R, etc., channel proteins for influx or efflux of ions, e.g. Ca+2, K+, Na+, Cl-
and the like; CFTR, tyrosine activation motif, zap-70, etc.

Proteins may be modified for transport to a vesicle for exocytosis. By adding the
sequence from a protein which is directed to vesicles, where the sequence is modified
proximal to one or the other terminus, or situated in an analogous position to the protein
source, the modified protein will be directed to the Golgi apparatus for packaging in a
vesicle. This process in conjunction with the presence of the chimeric proteins for
exocytosis allows for rapid transfer of the proteins to the extracellular medium and a
relatively high localized concentration.

Also, intracellular proteins can be of interest, such as proteins in metabolic
pathways, regulatory proteins, steroid receptors, transcription factors, etc., depending upon
the nature of the host cell. Some of the proteins indicated above can also serve as
intracellular proteins.

By way of further illustration, in T-cells, one may wish to introduce genes
encoding one or both chains of a T-cell receptor. For B-cells, one could provide the heavy
and light chains for an immunoglobulin for secretion. For cutaneous cells, e.g.
keratinocytes, particularly stem cell keratinocytes, one could provide for protection against
infection, by secreting α-, β- or γ-interferon, antichemotactic factors, proteases specific for
bacterial cell wall proteins, etc.

In addition to providing for expression of a gene having therapeutic value, there
will be many situations where one may wish to direct a cell to a particular site. The site
can include anatomical sites, such as lymph nodes, mucosal tissue, skin, synovium, lung or
other internal organs or functional sites, such as clots, injured sites, sites of surgical
manipulation, inflammation, infection, etc. By providing for expression of surface
membrane proteins which will direct the host cell to the particular site by providing for
binding at the host target site to a naturally-occurring epitope, localized concentrations of a
secreted product can be achieved. Proteins of interest include homing receptors, e.g.
L-selectin, GMP140, CLAM-1, etc., or addressins, e.g. ELAM-1, PNAd, LNAd, etc., clot
binding proteins, or cell surface proteins that respond to localized gradients of chemotactic
factors. There are numerous situations where one would wish to direct cells to a particular
site, where release of a therapeutic product could be of great value.

For use in gene therapy, the target gene can encode any gene product that is
beneficial to a subject. The gene product can be a secreted protein, a membranous
protein or a cytoplasmic protein. Preferred secreted proteins include growth factors,
differentiation factors, cytokines, interleukins, tPA, and erythropoietin. Preferred
membranous proteins include receptors, e.g., growth factor or cytokine receptors or
proteins mediating apoptosis, e.g., Fas receptor. Other candidate therapeutic genes are
disclosed in PCT/US93/01617.

In yet another embodiment, a "gene activation" construct which, by homologous
recombination with a genomic DNA, alters the transcriptional regulatory sequences of an
endogenous gene, can be used to introduce recognition elements for a DNA binding
activity of one of the subject engineered proteins. A variety of different formats for the
gene activation constructs are available. See, for example, the Transkaryotic Therapies, Inc.
PCT publications WO93/09222, WO95/31560, WO96/29411, WO95/31560 and
WO94/12650.

V. Methods of the invention

The invention provides methods for stimulating transcription of a target gene, in
particular, an integrated gene. The gene can be an endogenous gene, an exogenous gene,
or a combination of both. The method of the invention comprises introducing into a target
cell a multiplicity of transcriptional activation units linked to a DNA binding domain
and/or a ligand binding domain. If the transcriptional activation units are linked to a
ligand binding domain, then the method further comprises introducing into the cell a DNA
binding domain linked to at least one ligand binding domain and a ligand capable of
interacting with the ligand binding domains, such that the transcriptional activation units
and DNA binding domain are interacting, thereby stimulating transcription of a target
gene.

In another embodiment, the method of the invention comprises introducing into a
target cell a multiplicity of ligand binding domains linked to a DNA binding domain
together with at least one transcriptional activation unit linked to a ligand binding domain
and a ligand which is capable of simultaneously binding to the two ligand binding
domains. Thus, exposure of a target cell to an appropriate ligand will result in
formation of a transcriptional activator complex comprising a multiplicity of activation
units. Furthermore, since, as shown herein, the level of transcription of a target gene is
directly correlated to the number of activation units that can be delivered to the target
gene, the level of transcription of a target gene can be controlled by the level of ligand
used to treat the target cell.

In a preferred embodiment the method of the invention comprises introducing into
a cell a target gene, a nucleic acid encoding a polypeptide having a multiplicity of
activation units and a DNA binding domain or a ligand binding domain. If the polypeptide
comprises a ligand binding domain, then the method further comprises introducing into the
cell a nucleic acid encoding a polypeptide comprising a DNA binding domain and a ligand
binding domain. The nucleic acids and vectors comprising such can be prepared as
described above. These can be introduced into a cell comprising a target gene, i.e., target
cell, according to methods known in the art. In embodiments in which the target cell is in
vitro, the nucleic acids can be introduced into the target cell by transfection involving
electroporation, chemical transformation (e.g., calcium chloride), liposomes, or viral
transformation. In embodiments in which the target cell is in vivo, the nucleic acids can
be introduced into the cell by direct injection of naked DNA, liposomes, or any other
method of administration of nucleic acids to a subject.

VI. Kits

This invention further provides kits useful for the foregoing applications. One
such kit contains one or more nucleic acids encoding a transcriptional activator or subunits
thereof. The kit may further comprise an additional nucleic acid containing a target gene
linked to a DNA sequence to which the transcriptional activator is capable of binding.
Alternatively, the additional nucleic acid may contain a cloning site for insertion of a
desired target gene by the practitioner. For regulatable applications, i.e., in cases in which
the recombinant protein contains a ligand binding domain or inducible domain, the kit may
further contain an oligomerizing agent, such as the macrolide dimerizers discussed above.
Such kits may for example contain a sample of a dimerizing agent capable of dimerizing
the two recombinant proteins and activating transcription of the target gene.



VII. Exemplary Uses

The invention provides methods for stimulating transcription of a target gene at
high levels, in particular of target genes integrated into chromosomal DNA. Strong
transcriptional activators had not been prepared in the past, since it was known that potent
transcriptional activators would induce squelching. However, as shown herein,
transcription of an integrated gene can be stimulated at significantly higher levels than
non-integrated genes due to the absence of squelching on integrated genes. This
observation is important in the context of certain gene therapy applications because it
suggests that highly potent transcriptional activators can be used to drive therapeutic gene
expression to very high levels without general toxicity to the cell and that efficacious
levels of secreted therapeutic proteins may be attained with fewer engineered cells than
previously thought. Furthermore, the invention provides a method for controlling the level
of transcription, by controlling the amount of transcriptional activation units delivered to
the target gene by use of varying amounts of the oligomerizer.

In one embodiment, the invention is used to produce higher levels of a desired
protein ex vivo. Production of recombinant therapeutic proteins for commercial and
investigational purposes is often achieved through the use of mammalian cell lines
engineered to express the protein at high level. The use of mammalian cells, rather than
bacteria or yeast, is indicated where the proper function of the protein requires
post-translational modifications not generally performed by heterologous cells. Examples of
proteins produced commercially this way include erythropoietin, tissue plasminogen
activator, clotting factors such as Factor VIIIc, antibodies, etc. The cost of producing
proteins in this fashion is directly related to the level of expression achieved in the
engineered cells. Thus, because the invention described herein can achieve considerably
higher expression levels than conventional expression systems, it may greatly reduce the
cost of protein production. A second limitation on the production of such proteins is
toxicity to the host cell: protein expression may prevent cells from growing to high
density, sharply reducing production levels. Therefore, the ability to tightly control
protein expression, as described for regulated gene therapy, permits cells to be grown to
high density in the absence of protein production. Only after an optimum cell density is
reached is expression of the gene activated and the protein product subsequently
harvested.

A similar problem is encountered in the construction and use of "packaging lines"
for the production of recombinant viruses for commercial (e.g., gene therapy) and
experimental use. These cell lines are engineered to produce viral proteins required for the
assembly of infectious viral particles harboring defective recombinant genomes. Viral
vectors that are dependent on such packaging lines include retrovirus, adenovirus, and
adeno-associated virus. In the latter case, the titer of the virus stock obtained from a
packaging line is directly related to the level of production of the viral rep and core
proteins. But these proteins are highly toxic to the host cells. Therefore, it has proven
difficult to generate high-titer recombinant viruses. This invention provides a solution to
this problem, by allowing the construction of packaging lines in which the rep and core
genes are placed under the control of regulatable transcription factors of the design
described here. The packaging cell line can be grown to high density, infected with helper
virus, and transfected with the recombinant viral genome. Then, expression of the viral
proteins encoded by the packaging cells is induced by the addition of dimerizing agent to
allow the production of virus at high titer.

In other embodiments, the subject constructs are used as part of a therapeutic
treatment program for an animal. In one embodiment, the constructs of the invention are
used to stimulate transcription of an endogenous gene of a subject. The endogenous gene
can be any gene of the genome, increased expression of which is beneficial to a subject.
For example, a subject may not produce sufficient amounts of a specific protein, due to a
defect in a protein regulating the expression of the gene encoding the protein. In other
embodiments, expression of a target gene is desired to compensate for the deficiency of
expression of another gene. In yet other embodiments, the method of the invention is used
to stimulate expression of an endogenous gene to compensate for a loss of the protein
encoded by the endogenous gene. For example, the number of cells producing a specific
secreted protein may be reduced in a subject, e.g., as the result of a disease or condition,
thus resulting in reduced production of the specific factor in the subject.

In a preferred embodiment, the method of the invention is used to stimulate
production of a factor which is necessary for the proliferation and/or differentiation of one

administrations, the ability of the cells to multiply, the stability of the therapeutic agent,
the physiologic need for the therapeutic agent, and the like. Generally, for myoblasts or
fibroblasts for example, the number of cells will be at least about 10^4 and not more than
about 10^9 and may be applied as a dispersion, generally being injected at or near the site
of interest. The cells will usually be in a physiologically-acceptable medium.

Cells engineered in accordance with this invention may also be encapsulated, e.g.
using conventional biocompatible materials and methods, prior to implantation into the
host organism or patient for the production of a therapeutic protein. See e.g. Nguyen et al,
Tissue Implant Systems and Methods for Sustaining Viable High Cell Densities within a
Host, US Patent No. 5,314,471 (Baxter International, Inc.); Uludag and Sefton, 1993, J
Biomed. Mater. Res. 27(10):1213-24 (HepG2 cells/hydroxyethyl methacrylate-methyl
methacrylate membranes); Chang et al, 1993, Hum Gene Ther 4(4):433-40 (mouse Ltk-
cells expressing hGH/immunoprotective permselective alginate microcapsules); Reddy et
al, 1993, J Infect Dis 168(4):1082-3 (alginate); Tai and Sun, 1993, FASEB J 7(11):1061-9
(mouse fibroblasts expressing hGH/alginate-poly-L-lysine-alginate membrane); Ao et al,
1995, Transplantation Proc. 27(6):3349, 3350 (alginate); Rajotte et al, 1995,
Transplantation Proc. 27(6):3389 (alginate); Lakey et al, 1995, Transplantation Proc.
27(6):3266 (alginate); Korbutt et al, 1995, Transplantation Proc. 27(6):3212 (alginate);
Dorian et al, US Patent No. 5,429,821 (alginate); Emerich et al, 1993, Exp Neurol
122(1):37-47 (polymer-encapsulated PC12 cells); Sagen et al, 1993, J Neurosci
13(6):2415-23 (bovine chromaffin cells encapsulated in semipermeable polymer
membrane and implanted into rat spinal subarachnoid space); Aebischer et al, 1994, Exp
Neurol 126(2):151-8 (polymer-encapsulated rat PC12 cells implanted into monkeys; see
also Aebischer, WO 92/19595); Savelkoul et al, 1994, J Immunol Methods 170(2):185-96
(encapsulated hybridomas producing antibodies; encapsulated transfected cell lines
expressing various cytokines); Winn et al, 1994, PNAS USA 91(6):2324-8 (engineered
BHK cells expressing human nerve growth factor encapsulated in an immunoisolation
polymeric device and transplanted into rats); Emerich et al, 1994, Prog
Neuropsychopharmacol Biol Psychiatry 18(5):935-46 (polymer-encapsulated PC12 cells
implanted into rats); Kordower et al, 1994, PNAS USA 91(23):10898-902
(polymer-encapsulated engineered BHK cells expressing hNGF implanted into monkeys) and Butler
et al, WO 95/04521 (encapsulated device). The cells may then be introduced in
encapsulated form into an animal host, preferably a mammal and more preferably a human
subject in need thereof. Preferably the encapsulating material is semipermeable, permitting
release into the host of secreted proteins produced by the encapsulated cells. In many
embodiments the semipermeable encapsulation renders the encapsulated cells
immunologically isolated from the host organism in which the encapsulated cells are
introduced. In those embodiments the cells to be encapsulated may express one or more
chimeric proteins containing component domains derived from proteins of the host species
and/or from viral proteins or proteins from species other than the host species. For
example, in such cases the chimeras may contain elements derived from GAL4 and VP16.
The cells may be derived from one or more individuals other than the recipient and may be
derived from a species other than that of the recipient organism or patient.

Instead of ex vivo modification of the cells, in many situations one may wish to
modify cells in vivo. For this purpose, various techniques have been developed for
modification of target tissue and cells in vivo. A number of viral vectors have been
developed, such as described above, which allow for transfection and, in some cases,
integration of the virus into the host. See, for example, Dubensky et al. (1984) Proc. Natl.
Acad. Sci. USA 81, 7529-7533; Kaneda et al., (1989) Science 243, 375-378; Hiebert et al.
(1989) Proc. Natl. Acad. Sci. USA 86, 3594-3598; Hatzoglu et al. (1990) J. Biol. Chem.
265, 17285-17293 and Ferry, et al. (1991) Proc. Natl. Acad. Sci. USA 88, 8377-8381. The
vector may be administered by injection, e.g. intravascularly or intramuscularly,
inhalation, or other parenteral mode. Non-viral delivery methods such as administration of
the DNA via complexes with liposomes or by injection, catheter or biolistics may also be
used.

In accordance with in vivo genetic modification, the manner of the modification
will depend on the nature of the tissue, the efficiency of cellular modification required, the
number of opportunities to modify the particular cells, the accessibility of the tissue to the
DNA composition to be introduced, and the like. By employing an attenuated or modified
retrovirus carrying a target transcriptional initiation region, if desired, one can activate the
virus using one of the subject transcription factor constructs, so that the virus may be
produced and transfect adjacent cells.

The DNA introduction need not result in integration in every case. In some
situations, transient maintenance of the DNA introduced may be sufficient. In this way,
one could have a short term effect, where cells could be introduced into the host and then
turned on after a predetermined time, for example, after the cells have been able to home
to a particular site.

In another embodiment of the invention, the transcriptional activator of the
invention recognizes a target endogenous gene, in which the promoter and/or one or more
other regions of the gene has been modified to include a target sequence that is specifically
recognized by the DNA binding domain of a known transcription factor and the
transcriptional activator contains this DNA binding domain. Thus, the target endogenous
gene is modified to be specifically recognized by a desired transcription factor. Such an
embodiment can be useful in situations in which no DNA binding protein is known to
specifically bind to a regulatory region of the target gene. Thus, in one embodiment, a cell
is obtained from a subject and the cell is genetically engineered in vitro to insert a desired
regulatory sequence into the promoter of the target gene. The cell can then be further
administered to the subject. Alternatively, prior to administration of the cell to the subject,
the cell can further be modified to include a nucleic acid encoding a transcriptional
activator comprising a DNA binding domain which is capable of interacting specifically
with the regulatory element introduced into the target gene. In another embodiment, an
endogenous gene is modified in vivo by, e.g., homologous recombination.

Modification of a gene in a cell can be done, e.g., by homologous recombination, a
technique well known in the art, and described, e.g., in Thomas and Capecchi (1987) Cell
51:503; Mansour et al. (1988) Nature 336:348; and Joyner et al. (1989) Nature 338:153.

In another embodiment, the transcriptional activator is used to stimulate
transcription of an exogenous gene integrated into chromosomal DNA of a subject. An
exogenous target gene can be introduced into a subject by obtaining a cell from a subject,
introducing the target gene and optionally a nucleic acid encoding a transcriptional
activator into the cell and administering the cell to the subject. This embodiment is useful
in situations in which no DNA binding protein is known to specifically bind to a
regulatory region of the target gene or in situations in which the target gene encodes a
protein which is not naturally produced by a cell. For example, the target gene can be a
tumor antigen, which is not produced by the subject under normal conditions, but which
one desires to express in the subject as a vaccine antigen to prevent development of a
tumor expressing the tumor antigen.

Exogenous genes can also encode antisense RNA or ribozymes or other RNA 
molecules which are not translated. For example, the method of the invention can be used 
to inhibit production of one or more specific proteins in a cell of a subject. The 
availability of potent transcriptional activators provided by the invention will ensure that 
high levels of RNA, e.g., antisense RNA, are produced in a cell.

In a preferred embodiment of the invention, the transcriptional activator is a 
complex comprising a first fusion protein having multiple activation units and a ligand 
binding domain, a second fusion protein having a DNA binding domain and a ligand 
binding domain, and a ligand which interacts simultaneously with both ligand binding 
domains. Thus, activation of transcription of a target gene is stimulated only in the 
presence of the ligand, e.g., dimerizing agent. Accordingly, expression of the target gene 
in a subject is stimulated only upon administration of the ligand to the subject. 

The dimerizing ligand may be administered to the patient as desired to activate
transcription of the target gene. Depending upon the binding affinity of the ligand, the
response desired, the manner of administration, the half-life, the number of cells present,
various protocols may be employed. The ligand may be administered parenterally or
orally. The number of administrations will depend upon the factors described above. The
ligand may be taken orally as a pill, powder, or dispersion; buccally; sublingually; injected
intravascularly, intraperitoneally, subcutaneously; by inhalation, or the like. The ligand
(and monomeric antagonist compound) may be formulated using conventional methods
and materials well known in the art for the various routes of administration. The precise
dose and particular method of administration will depend upon the above factors and be
determined by the attending physician or human or animal healthcare provider. For the
most part, the manner of administration will be determined empirically.

In the event that transcriptional activation by the ligand is to be reversed or
terminated, a monomeric compound which can compete with the dimerizing ligand may be
administered. Thus, in the case of an adverse reaction or the desire to terminate the
therapeutic effect, an antagonist to the dimerizing agent can be administered in any
convenient way, particularly intravascularly, if a rapid reversal is desired. Alternatively,
one may provide for the presence of an inactivation domain (or transcriptional silencer)
with a DNA binding domain. In another approach, cells may be eliminated through
apoptosis via signaling through Fas or TNF receptor as described elsewhere. See
International Patent Applications PCT/US94/01617 and PCT/US94/08008.

The particular dosage of the ligand for any application may be determined in
accordance with the procedures used for therapeutic dosage monitoring, where
maintenance of a particular level of expression is desired over an extended period of time,
for example, greater than about two weeks, or where there is repetitive therapy, with
individual or repeated doses of ligand over short periods of time, with extended intervals,
for example, two weeks or more. A dose of the ligand within a predetermined range
would be given and monitored for response, so as to obtain a time-expression level
relationship, as well as observing therapeutic response. Depending on the levels observed
during the time period and the therapeutic response, one could provide a larger or smaller
dose the next time, following the response. This process would be iteratively repeated
until one obtained a dosage within the therapeutic range. Where the ligand is chronically
administered, once the maintenance dosage of the ligand is determined, one could then do
assays at extended intervals to be assured that the cellular system is providing the
appropriate response and level of the expression product.
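
The iterative adjustment described above can be sketched in code. The following is a minimal illustration only, not a procedure taken from this disclosure: the function name `titrate_dose`, the fixed proportional step, the target window, and the linear response used in the usage line are all hypothetical assumptions chosen to show the give-dose, observe-level, adjust loop.

```python
from typing import Callable


def titrate_dose(
    measure_level: Callable[[float], float],  # hypothetical: returns observed expression level for a dose
    start_dose: float,
    target_low: float,
    target_high: float,
    step: float = 0.25,      # assumed fractional change applied per round
    max_rounds: int = 20,    # assumed cap so the loop always terminates
) -> float:
    """Raise or lower the dose until the observed level falls in the target window."""
    if start_dose <= 0 or not (0 < step < 1) or target_low > target_high:
        raise ValueError("invalid titration parameters")
    dose = start_dose
    for _ in range(max_rounds):
        level = measure_level(dose)
        if target_low <= level <= target_high:
            return dose  # dosage within the desired range
        # Larger dose next time if the level was too low, smaller if too high.
        dose *= (1 + step) if level < target_low else (1 - step)
    raise RuntimeError("no dose within the target range was found")


# Usage with a purely illustrative linear response (level = 2 x dose).
maintenance = titrate_dose(lambda d: 2.0 * d, start_dose=1.0,
                           target_low=3.5, target_high=4.5)
```

In practice the measurement step would be an assay of the expression product and the adjustment rule would be set by the attending practitioner; the fixed proportional step here is only a placeholder.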

It should be appreciated that the system is subject to many variables, such as the
cellular response to the ligand, the efficiency of expression and, as appropriate, the level of
secretion, the activity of the expression product, the particular need of the patient, which
may vary with time and circumstances, the rate of loss of the cellular activity as a result of
loss of cells or expression activity of individual cells, and the like. Therefore, it is
expected that for each individual patient, even if there were universal cells which could be
administered to the population at large, each patient would be monitored for the proper
dosage for the individual.

In one embodiment of the invention, methods of introduction of a ligand, e.g.,
dimerizing agent, and/or a cell modified according to the method of the invention to
synthesize high levels of protein from a target gene may be provided by rechargeable or
biodegradable devices. Various slow release polymeric devices have been developed and
tested in vivo in recent years for the controlled delivery of drugs, including proteinaceous
biopharmaceuticals. A variety of biocompatible polymers (including hydrogels),
including both biodegradable and non-degradable polymers, can be used to form an
implant for the sustained release of a dimerizer or a protein produced by a cell modified
according to the method of the invention at a particular target site. Such embodiments of
the present invention can be used for the delivery of an exogenously purified protein
produced according to the method of the invention, which has been incorporated in the
polymeric device, or for the delivery of a protein produced by a cell encapsulated in the
polymeric device.

An essential feature of certain embodiments of the implant can be the linear release
of the dimerizer or protein produced by the encapsulated cell which can be achieved
through the manipulation of the polymer composition and form. By choice of monomer
composition or polymerization technique, the amount of water, porosity and consequent
permeability characteristics can be controlled. The selection of the shape, size, polymer,
and method for implantation can be determined on an individual basis according to the
disorder to be treated and the individual patient response. The generation of such implants
is generally known in the art. See, for example, Concise Encyclopedia of Medical & Dental
Materials, ed. by David Williams (MIT Press: Cambridge, MA, 1990); and the Sabel et al.
U.S. Patent No. 4,883,666. In another embodiment of an implant, a source of cells,
modified according to the method of the invention, producing a desired protein, or a
solution of hydrogel matrix containing purified protein or dimerizer, is encapsulated in
implantable hollow fibers. Such fibers can be pre-spun and subsequently loaded with the
protein source or dimerizer (Aebischer et al. U.S. Patent No. 4,892,538; Aebischer et al.
U.S. Patent No. 5,106,627; Hoffman et al. (1990) Expt. Neurobiol. 110:39-44; Jaeger et al.
(1990) Prog. Brain Res. 82:41-46; and Aebischer et al. (1991) J. Biomech. Eng.
113:178-183), or can be co-extruded with a polymer which acts to form a polymeric coat about the
cell, protein or dimerizer (Lim U.S. Patent No. 4,391,909; Sefton U.S. Patent No.
4,353,888; Sugamori et al. (1989) Trans. Am. Artif. Intern. Organs 35:791-799; Sefton et
al. (1987) Biotechnol. Bioeng. 29:1135-1143; and Aebischer et al. (1991) Biomaterials
12:50-55).

Thus, the method of the invention can broadly be applied to any situation, e.g., for
treating or preventing any disease or condition, in which transcriptional activation of an
integrated target gene is desired. Depending on the specific embodiment of the invention,
a transcriptional activator, nucleic acid encoding such, target exogenous gene, and/or
oligomerizer is administered to a subject. These can be administered as such or together
with a delivery vehicle, e.g., liposomes. Whether with or without a delivery vehicle, these
compounds are preferably administered together with a pharmaceutically acceptable
carrier. Methods of administration of these compounds are known in the art and are
briefly disclosed below.

Toxicity and therapeutic efficacy of such compounds can be determined by standard
pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining
the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically
effective in 50% of the population). The dose ratio between toxic and therapeutic effects
is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds which
exhibit large therapeutic indices are preferred. While compounds that exhibit toxic side
effects may be used, care should be taken to design a delivery system that targets such
compounds to the site of affected tissue in order to minimize potential damage to
uninfected cells and, thereby, reduce side effects.
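
As a rough illustration of the ratio defined above, the therapeutic index can be computed as in the following Python sketch. The function name and the dose values are hypothetical and are not taken from this specification; the only assumption is that both doses are expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical doses (mg/kg), chosen only for illustration.
print(therapeutic_index(ld50=200.0, ed50=10.0))
```

A larger value indicates a wider margin between the dose that is effective and the dose that is lethal in half of the population.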

The data obtained from the cell culture assays and animal studies can be used in
formulating a range of dosage for use in humans. The dosage of such compounds lies
preferably within a range of circulating concentrations that include the ED50 with little
or no toxicity. The dosage may vary within this range depending upon the dosage form
employed and the route of administration utilized. For any compound used in the method of
the invention, the therapeutically effective dose can be estimated initially from cell
culture assays. A dose may be formulated in animal models to achieve a circulating plasma
concentration range that includes the IC50 (i.e., the concentration of the test compound
which achieves a half-maximal inhibition of symptoms) as determined in cell culture. Such
information can be used to more accurately determine useful doses in humans. Levels in
plasma may be measured, for example, by high performance liquid chromatography.
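
One simple way to estimate a half-maximal concentration from cell culture data is to interpolate between the two measured concentrations that bracket 50% inhibition. The Python sketch below is an illustration only: the function name is hypothetical, it assumes inhibition is recorded as a percentage of the maximal effect, and it interpolates linearly on a log10 concentration scale rather than fitting a full dose-response curve.

```python
import math


def ic50_by_interpolation(concs, inhibition):
    """Estimate the concentration giving 50% inhibition.

    concs: positive concentrations (any consistent unit).
    inhibition: percent inhibition measured at each concentration.
    Interpolates linearly in log10(concentration) between the two
    adjacent points that bracket 50%.
    """
    pairs = sorted(zip(concs, inhibition))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(pairs, pairs[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi > y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("data do not bracket 50% inhibition")


# Hypothetical measurements: concentrations in nM, percent inhibition.
print(ic50_by_interpolation([1.0, 100.0], [0.0, 100.0]))
```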

Pharmaceutical compositions for use in accordance with the present invention may
be formulated in conventional manner using one or more physiologically acceptable
carriers or excipients. Thus, the compounds and their physiologically acceptable salts and
solvates may be formulated for administration by, for example, injection, inhalation or
insufflation (either through the mouth or the nose) or oral, buccal, parenteral or rectal
administration.

For such therapy, the compounds of the invention can be formulated for a variety
of modes of administration, including systemic and topical or localized administration.
Techniques and formulations generally may be found in Remington's Pharmaceutical
Sciences, Mack Publishing Co., Easton, PA. For systemic administration, injection is
preferred, including intramuscular, intravenous, intraperitoneal, and subcutaneous. For
injection, the compounds of the invention can be formulated in liquid solutions, preferably
in physiologically compatible buffers such as Hank's solution or Ringer's solution. In
addition, the compounds may be formulated in solid form and redissolved or suspended
immediately prior to use. Lyophilized forms are also included.

For oral administration, the pharmaceutical compositions may take the form of, for
example, tablets or capsules prepared by conventional means with pharmaceutically
acceptable excipients such as binding agents (e.g., pregelatinised maize starch,
polyvinylpyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose,
microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium
stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or
wetting agents (e.g., sodium lauryl sulphate). The tablets may be coated by methods well
known in the art. Liquid preparations for oral administration may take the form of, for
example, solutions, syrups or suspensions, or they may be presented as a dry product for
constitution with water or other suitable vehicle before use. Such liquid preparations may
be prepared by conventional means with pharmaceutically acceptable additives such as
suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats);
emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily
esters, ethyl alcohol or fractionated vegetable oils); and preservatives (e.g., methyl or
propyl-p-hydroxybenzoates or sorbic acid). The preparations may also contain buffer
salts, flavoring, coloring and sweetening agents as appropriate.

Preparations for oral administration may be suitably formulated to give controlled
release of the active compound. For buccal administration the compositions may take the
form of tablets or lozenges formulated in conventional manner. For administration by
inhalation, the compounds for use according to the present invention are conveniently
delivered in the form of an aerosol spray presentation from pressurized packs or a
nebuliser, with the use of a suitable propellant, e.g., dichlorodifluoromethane,
trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In
the case of a pressurized aerosol the dosage unit may be determined by providing a valve
to deliver a metered amount. Capsules and cartridges of e.g. gelatin for use in an inhaler
or insufflator may be formulated containing a powder mix of the compound and a suitable
powder base such as lactose or starch.

The compounds may be formulated for parenteral administration by injection, e.g.,
by bolus injection or continuous infusion. Formulations for injection may be presented in
unit dosage form, e.g., in ampoules or in multi-dose containers, with an added
preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as
suspending, stabilizing and/or dispersing agents. Alternatively, the active ingredient may
be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water,
before use.

The compounds may also be formulated in rectal compositions such as
suppositories or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or other glycerides.

In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular
injection. Thus, for example, the compounds may be formulated with suitable polymeric
or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion
exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble
salt.

Systemic administration can also be by transmucosal or transdermal means. For
transmucosal or transdermal administration, penetrants appropriate to the barrier to be
permeated are used in the formulation. Such penetrants are generally known in the art, and
include, for example, for transmucosal administration bile salts and fusidic acid
derivatives. In addition, detergents may be used to facilitate permeation. Transmucosal
administration may be through nasal sprays or using suppositories. For topical
administration, the oligomers of the invention are formulated into ointments, salves, gels,
or creams as generally known in the art.

In clinical settings, the gene delivery systems for the genes encoding
transcriptional activators and optionally target gene can be introduced into a patient by any
of a number of methods, each of which is familiar in the art. For instance, a
pharmaceutical preparation of the gene delivery system can be introduced systemically,
e.g. by intravenous injection, and specific transduction of the protein in the target cells
occurs predominantly from specificity of transfection provided by the gene delivery
vehicle, cell-type or tissue-type expression due to the transcriptional regulatory sequences
controlling expression of the receptor gene, or a combination thereof. In other
embodiments, initial delivery of the recombinant gene is more limited with introduction
into the animal being quite localized. For example, the gene delivery vehicle can be
introduced by catheter (see U.S. Patent 5,328,470) or by stereotactic injection (e.g. Chen et
al. (1994) PNAS 91: 3054-3057).

The pharmaceutical preparation of the gene therapy construct can consist 
essentially of the gene delivery system in an acceptable diluent, or can comprise a slow 
release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the 
complete gene delivery system can be produced intact from recombinant cells, e.g. 
retroviral vectors, the pharmaceutical preparation can comprise one or more cells which 
produce the gene delivery system. 

The compositions may, if desired, be presented in a pack or dispenser device which
may contain one or more unit dosage forms containing the active ingredient. The pack
may for example comprise metal or plastic foil, such as a blister pack. The pack or
dispenser device may be accompanied by instructions for administration.

The present invention is further illustrated by the following examples which should
not be construed as limiting in any way. The contents of all cited references including
literature references, issued patents, published patent applications as cited throughout this
application are hereby expressly incorporated by reference. The practice of the present
invention will employ, unless otherwise indicated, conventional techniques of cell biology,
cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and
immunology, which are within the skill of the art. Such techniques are explained fully in
the literature. See, for example, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by
Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA
Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait
ed., 1984); Mullis et al. U.S. Patent No: 4,683,195; Nucleic Acid Hybridization (B. D.
Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J.
Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987);
Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To
Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc.,
N.Y.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds.,
1987, Cold Spring Harbor Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu
et al. eds.); Immunochemical Methods In Cell And Molecular Biology (Mayer and
Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology,
Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); Manipulating the Mouse
Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).

Exemplification

The invention now being generally described, it will be more readily understood by
reference to the following examples which are included merely for purposes of illustration
of certain aspects and embodiments of the present invention, and are not intended to limit
the invention.

Example 1: Stimulation of High Level Transcription of an Integrated Gene

This Example shows that a gene integrated into chromosomal DNA is immune
from the effect of squelching and that high level transcription of an integrated gene can be
obtained by delivering multiple activation domains to the gene.

First, the effect of increasing concentrations of a transcriptional activator on an
integrated versus a non-integrated reporter gene was compared as follows. Two constructs
encoding chimeric transcriptional activators under the control of the CMV
enhancer/promoter were used. The first construct (pCG-GV) encodes a chimeric
transcription factor composed of the yeast GAL4 DNA-binding domain (amino acids 1-94)
and activation domain (amino acids 410-490) derived from the herpes simplex virus
protein VP16 (Sadowski, I., et al. (1988) Nature 335:563-4). The second construct
(pCG-GS) encodes a chimeric transcription factor composed of the yeast GAL4 DNA-binding
domain (amino acids 1-94) and the activation domain (amino acids 361-550) from the
NF-κB p65 protein (Ballard, D.W., et al. (1991) Proc. Natl. Acad. Sci. 89:1875-1879;
Schmitz, M.L. and Baeuerle, P.A. (1991) EMBO J. 10:3805-3817). The effect of these
transcription factors was tested on a target gene composed of a secreted alkaline
phosphatase (SEAP) reporter under the control of a minimal human IL-2 gene promoter
flanked by five GAL4 binding sites.

HT1080 cells were transiently transfected with the reporter construct and/or a
construct encoding a chimeric transcription factor as follows. HT1080 cells were grown at
37°C in MEM medium containing 10% fetal calf serum, non-essential amino acids and
penicillin-streptomycin. Twenty-four hours before transfection, approximately 2X10=
cells were seeded in each well in a 12-well plate. Cells were transfected using
Lipofectamine as recommended (Gibco BRL). Cells in each well received the amount of
plasmids indicated in the figure, with or without 400 ng of reporter plasmid, with the total
amount of DNA being adjusted to 1.25 µg with pUC19. Five hours later, the medium was
removed and 1 ml of fresh medium added. 18-24 hours later, 100 µl medium was removed
and assayed for SEAP activity using a Luminescence Spectrometer (Perkin Elmer) at 350
nm excitation and 450 nm emission.
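
The DNA balancing step described above (total DNA brought to 1.25 µg with pUC19) amounts to simple arithmetic, sketched below in Python. The function name and the example activator amounts are hypothetical; the 400 ng reporter amount and the 1250 ng total are the values stated in the text.

```python
def filler_dna_ng(activator_ng: float,
                  reporter_ng: float = 400.0,
                  total_ng: float = 1250.0) -> float:
    """Amount of carrier plasmid (pUC19) needed to reach the fixed total DNA."""
    filler = total_ng - activator_ng - reporter_ng
    if filler < 0:
        raise ValueError("activator plus reporter exceed the total DNA per well")
    return filler


# Hypothetical activator plasmid amounts (ng), with and without reporter.
print(filler_dna_ng(100.0))                   # with 400 ng reporter
print(filler_dna_ng(100.0, reporter_ng=0.0))  # without reporter
```

Holding the total DNA constant in this way keeps the transfection conditions comparable across wells that receive different amounts of activator plasmid.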

HT1080 cells were stably transfected with the reporter construct pLH-5xGAL4-
IL2-SEAP as follows. The retroviral vector pLH-5xGal4-IL2-SEAP was constructed by
cloning the 5xGAL4-IL2-SEAP fragment described above into the vector pLH (Rivera,
V.M., et al. (1996) Nature Medicine 2:1028-1032), which also contains the hygromycin B
resistance gene driven by the Moloney murine leukemia virus long terminal repeat.
Helper-free retrovirus, generated as described (Rivera, V.M., et al. Nature Medicine
2:1028-1032 (1996)) was used to infect HT1080 cells. Hundreds of hygromycin B (300
µg/ml) resistant clones were pooled (HT1080B pool) and individual clones screened by
transient transfection with pCG-GS. The most responsive clone, HT1080B, was selected
for further analysis.

When the reporter gene and expression plasmids encoding the transcription factors
were both introduced transiently into HT1080 human fibrosarcoma cells, peak reporter
gene expression at relatively low levels of input activator plasmid was observed, while
higher levels of activator led to a sharp reduction in reporter gene expression (Fig. 1A).
This inhibition of gene expression in the presence of high levels of activator has been
widely observed, and the phenomenon has been termed "squelching." In contrast,
expression of the integrated reporter gene was not inhibited by levels of activator that
sharply squelched a transiently transfected reporter (Fig. 1B).

To rule out the possibility that the immunity of the integrated gene to squelching
reflects the effect of a single unusual integration site, this experiment was repeated using a
pool of several hundred HT1080-derived clones harboring independent reporter gene
insertions. As shown in Figure 1C, expression of this population of inserted reporter genes
was not inhibited by high levels of activator. A second reporter construct in which the
same GAL4-driven IL-2 promoter was fused to a human growth hormone (hGH) reporter
gene was constructed by replacing the SEAP gene with the hGH gene. This plasmid was
cotransfected with the GAL4-p65 expression plasmid into cells containing an integrated
SEAP reporter gene, thus allowing both reporter genes to be assayed in the same cell
population. Transfections and reporter gene expression were measured as described
above. For determining expression of the hGH reporter construct 2-5 µl of medium was
assayed for hGH protein as recommended (Nichols Diagnostic).

If, in the presence of a high-copy episomal template, GAL4-p65 titrates GTFs
necessary for the activity of this promoter, then one would expect to observe inhibition of
both the episomal hGH gene and the integrated SEAP gene. However, the results indicate
that, whereas expression of the episomal hGH gene was inhibited at high activator
concentrations, the integrated SEAP gene responded identically whether or not the
episomal gene was present (Fig. 2). Similar results were obtained in parallel experiments
with GAL4-VP16. This observation indicates that one or more mechanistic steps in the
transcription cycle differ between episomal and integrated genes.

Number of Activation Domains Delivered to the Gene

This Example shows that expression of an integrated gene can be driven to very 
high levels by delivering many potent activation domains to the promoter of the gene. 

A modular strategy was designed to deliver different numbers of activation
domains to a single integrated target gene. This strategy was based on the ability of a
small-molecule "dimerizer" to recruit activation domains to a DNA-bound receptor
(Belshaw, P.J., et al. (1996) Proc. Natl. Acad. Sci. 93:4604-4607; Rivera, V.M., et al.
(1996) Nature Medicine 2:1028-1032; Ho, S.N., et al. (1996) Nature 382:822-824). The
basic system, which is diagrammed in Fig. 3A, was composed of a GAL4 DNA-binding
domain fused to a single copy of human FKBP12 and a p65 activation domain fused to the
FRB domain of FRAP (Ho, S.N., et al., supra; Sabatini, D.M., et al. (1994) Cell
78:35-43). In the presence of the natural-product immunosuppressive compound rapamycin, the
FRB-p65 fusion protein is efficiently recruited to the GAL4-FKBP fusion protein. This
basic system results in the delivery of a maximum of one p65 activation domain per GAL4
monomer. The number of deliverable activation domains was increased in two ways: (i)
by increasing the number of FKBP moieties fused to GAL4, as indicated in Fig. 3B; and
(ii) through the use of a tetramerization domain derived from the E. coli lac repressor
(Chakerian, A.E., et al. (1991) J Biol Chem 266:1371-4; Alberti, S., et al. (1993) EMBO
J 12:3227-36; Lewis, M., et al. (1996) Science 271:1247-1254) to deliver "bundles" of
four activation domains to each FKBP moiety, as shown in Fig. 3C. These different
configurations allowed recruitment of up to sixteen p65 activation domains to a single
GAL4 monomer.
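
The count of deliverable activation domains in the configurations just described follows from multiplying the number of FKBP moieties by the number of activation domains recruited to each. The Python sketch below is only an arithmetic illustration; the function name is hypothetical, and it assumes every FKBP moiety is occupied, so it gives the maximum rather than a measured value.

```python
def max_activation_domains(fkbp_copies: int, tetramerized: bool) -> int:
    """Upper bound on p65 activation domains recruited per GAL4 monomer.

    fkbp_copies: number of FKBP moieties fused to GAL4 (1 to 4 in the text).
    tetramerized: True when the FRB-p65 fusion carries the lac repressor
    tetramerization domain, so each FKBP recruits a bundle of four.
    """
    if fkbp_copies < 1:
        raise ValueError("at least one FKBP moiety is required")
    per_fkbp = 4 if tetramerized else 1
    return fkbp_copies * per_fkbp


for copies in (1, 2, 3, 4):
    print(copies, max_activation_domains(copies, False), max_activation_domains(copies, True))
```

With four FKBP moieties and the tetramerization domain this gives sixteen, matching the maximum stated above.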

These expression constructs were prepared as follows. pCGNN-G expression
vector was made by inserting a PCR fragment containing the GAL4 DNA binding domain
(amino acids 1-94) flanked by upstream XbaI and downstream SpeI and BamHI sites into
XbaI- and BamHI-digested pCGNN (Ricardo Anar, please provide a reference). FKBP12
coding sequence (amino acids 1-107), described in flanked by upstream XbaI and
downstream SpeI and BamHI sites was inserted between the SpeI and BamHI sites of
pCGNN-G to generate pCGNN-GF1. Plasmids pCGNN-GF2, GF3 and GF4 were made
by the sequential insertion of FKBP12 coding sequences into SpeI- and BamHI-digested
pCGNN-GF1, 2 and 3 plasmids, respectively. pCGNN-RL was constructed by cloning an
XbaI-BamHI fragment containing a portion of the E. coli lactose repressor (amino acids
46-360) into SpeI- and BamHI-digested pCGNN-R (Rivera, V.M., et al. supra) to fuse it to
the carboxy terminus of the FRB domain. The p65 activation domain was fused to the
carboxy terminus of this chimera by inserting an XbaI-BamHI fragment into SpeI- and
BamHI-digested pCGNN-RL to create pCGNN-RLS.

Various combinations of expression constructs (i.e., 10 ng of plasmid expressing
DNA binding domain fusions and increasing amounts of plasmid expressing p65
activation domain fusions) were transfected into HT1080B cells carrying an integrated
reporter gene. The cells were then treated with 10 nM rapamycin to deliver the activation
domains to the target gene and the amount of SEAP was measured in the supernatant. As
shown in Figure 4B and 4C, the results indicate a roughly linear increase in reporter gene
expression as a function of the number of activation domains that are delivered to a single
GAL4 monomer. In contrast, except for the increment from one to two activation domains
per GAL4 monomer, increasing the number of activation domains that can be delivered to
an episomal target gene (introduced by transient transfection) failed to enhance the peak
level of gene expression achieved before squelching occurred (Figures 4B and 4C).

These results indicate that expression of the episomal reporter gene can be driven 
only to a fixed level before squelching sets in, effectively establishing a ceiling for 
expression from the episomal gene. Because squelching does not impose such a limitation 
on the integrated gene, expression can be substantially augmented by delivering more 
activation domains, indicating that expression of an integrated chromatin-embedded gene 
is limited largely by the number (or strength) of activation domains bound at the promoter. 
Furthermore, the level of expression reached by the integrated gene is considerably higher 
than that attained by the episomal gene, especially when considered on a per-template 
basis, as shown in Figure 4C.

Example 3: Transcriptional Activation Units can Synergize to Activate
Transcription of an Integrated Target Gene

This Example shows that specific combinations of transcriptional activation
domains can result in potent transcriptional activators capable of strong transactivation of
integrated single copy genes.

The effect of various constructs encoding GAL4 DNA binding domain fusion
proteins on transcription of an integrated pLH-5xGal4-IL2-SEAP vector in HT1080 cells
(clone HT1080B) was determined. The activation domains fused to the GAL4 DNA
binding domain (amino acids 1-94) consisted of amino acids 18x4 of QUI (GQIII), amino
acids 1-92 of p53 (Gp53), amino acids 417-490 of VP16 (GVP16), amino acids 450-550
of p65 (Gp65), amino acids 399-499 of CTF (GCTF), amino acids 411-508 of SRF
(GSRF), or amino acids 263-499 of SP1 (GSP1).

Various amounts of these constructs were transiently transfected into the HT1080B
cell line having a single integrated copy of the vector pLH-5xGal4-IL2-SEAP and the
level of expression of the reporter gene was measured as described above. The results,
which are represented in Figure 6, indicate that among the various fusion proteins, only
GAL4-VP16 and GAL4-p65 activate transcription of the integrated single copy reporter
gene.

Next, the transactivation potential of combinations of transcriptional activation
domains was tested on a single copy integrated gene. Expression constructs were prepared
encoding a GAL4 DNA binding domain linked to 1, 2, or 4 VP16 activation domains (amino
acids 417-490) or 1, 2, 3, or 4 p65 activation domains (amino acids 450-550) or a p65
activation domain fused to a VP16 activation domain. The results are presented in Figure 7
and indicate that a multiplicity of p65 activation domains or VP16 activation domains does
not provide higher levels of transcription relative to a transcriptional activator containing a
single p65 or VP16 activation domain. In contrast, the activation domains of p65 and
VP16 are capable of cooperating in transcriptional activation of an integrated target gene.

In another experiment, the transcriptional potential of combinations of
transcriptional activation domains of VP16 on a single copy integrated gene was
determined. Expression constructs encoding the following fusion proteins containing a
GAL4 DNA binding domain (amino acids 1-94) were prepared and tested in transient
transfections of the HT1080 cell clone containing a single integrated pLH-5xGal4-
IL2-SEAP: fusion proteins containing 2, 4, 8, or 12 copies (GV8X2, GV8X4, GV8X8, and
GV8X12, respectively) of an eight amino acid variant of a transactivation domain from
VP16 having the amino acid sequence DFDLDMLG (referred to herein as "V8" peptide);
fusion proteins containing 1, 2, 3, 4, 5, or 6 copies (GVCX1, GVCX2, GVCX3, GVCX4,
GVCX5, GVCX6, respectively) of a transactivation domain from the C-terminus of VP16
corresponding to amino acids 461-490 (referred to herein as "Vc" peptide); and fusion
proteins containing 8 copies of V8 and 5 copies of Vc, wherein the V8 copies are fused
either to the N-terminus or the C-terminus of Vc. The results, which are presented in
Figure 8, panels A and B, indicate cooperativity between V8 peptides up to about 8 copies
of V8 and up to 5 copies of Vc. Higher numbers of copies of V8 and Vc result in reduced
transcriptional activity of the reporter gene. However, as shown in Figure 8, panel C, a
combination of 8 copies of V8 fused to the N-terminus of 5 copies of Vc results in
stronger transcriptional levels than that obtained with either 8 copies of V8 or 5 copies of
Vc. This transcriptional level is even higher than that obtained with full length VP16.
Interestingly, however, a fusion protein containing 8 copies of V8 fused to the C-terminus
of 5 copies of Vc does not result in higher transcriptional activity of the reporter
constructs, relative to 8 copies of V8 and 5 copies of Vc. Thus, specific arrangements of
transcriptional activation domains are more potent in activating transcription than other
arrangements.

Thus, this example demonstrates that combination of multiple activation domains
can result in potent transcriptional activators capable of stimulating transcription of an
integrated single copy gene.

Example 4: An Alanine/Proline Rich Domain of p65 is Capable of Strongly
Transactivating a Target Gene, when in the Presence of Another Transactivation
Domain

This Example demonstrates that a specific domain of p65 is incapable of stimulating
transcription of a target gene on its own, but can synergize with other transcriptional
activation domains to stimulate transcription of a target gene.

Analysis of the p65 amino acid sequence revealed the presence of two
Alanine/Proline rich regions in p65. The first domain, i.e., domain I, consists of amino
acids 376-401 and has the amino acid sequence SALALAPAPPQVLPQAPAPAPAPAMV
(SEQ ID NO...). The second domain, i.e., domain II, consists of amino acids 402-427 and
has the amino acid sequence SALAQAPAPVPVLAPGPPQAVAPPAP (SEQ ID NO...).
The transcriptional potential of these sequences was investigated as follows. Constructs
encoding fusion proteins containing GAL4 DNA binding domain (amino acids 1-92) and
the following domains were prepared for use in transient transfection: fusion proteins
containing 1, 2, or 3 copies of amino acids 361-450 of p65 containing the two above-
described alanine/proline rich regions and referred to herein as the AP domain; fusion
proteins containing 1, 2, or 4 copies of the SP1 activation domain; and fusion proteins
containing an SP1 activation domain fused at its N-terminal or C-terminal region to one or
two AP domains.
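
The alanine/proline content of the two p65 regions quoted above can be checked directly from their sequences. The Python sketch below is an illustration only: the variable and function names are hypothetical, and the sequences and residue ranges are copied from the text.

```python
from collections import Counter

# Sequences as given in the text for p65 domain I (amino acids 376-401)
# and domain II (amino acids 402-427).
DOMAIN_I = "SALALAPAPPQVLPQAPAPAPAPAMV"
DOMAIN_II = "SALAQAPAPVPVLAPGPPQAVAPPAP"


def ala_pro_fraction(seq: str) -> float:
    """Fraction of residues that are alanine (A) or proline (P)."""
    counts = Counter(seq)
    return (counts["A"] + counts["P"]) / len(seq)


for name, seq, first, last in (("I", DOMAIN_I, 376, 401), ("II", DOMAIN_II, 402, 427)):
    # The stated residue range should match the length of the sequence.
    assert len(seq) == last - first + 1
    print(f"domain {name}: {len(seq)} residues, Ala+Pro fraction {ala_pro_fraction(seq):.2f}")
```

Both regions are 26 residues long, consistent with the stated ranges, and alanine and proline together account for roughly two thirds of each.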

The amount of reporter gene expression obtained upon transfection of these
constructs into the HT1080B cell line containing an integrated copy of the vector pLH-
5xGal4-IL2-SEAP is shown in Figures 9A and B. These results indicate that 2 and 4
copies of the AP domain do not result in significant transactivation of the reporter gene,
in contrast to an increase in transcription proportional to the number of SP1 domains added
(Figure 9A). However, the combination of one or more AP domains with an SP1
activation domain synergistically activated transcription of the reporter construct (Figure
9B). Interestingly, the synergy appears to be stronger when the AP domain is linked to the
N-terminus of the SP1 activation domain. Thus, a combination of transcriptional
activation domains results in transcriptional activators which can be much more potent
than naturally occurring transcription factors.

Yet other combinations of transactivation domains were tested for their effect on
transcription of an integrated single copy gene. The fusion proteins tested comprised a
GAL4 DNA binding domain and one or more V8 domains, one or more Vc domains,
and/or one or more AP domains. Expression constructs encoding these fusion proteins
were transiently transfected into the HT1080 cell line containing an integrated copy of the
vector pLH-5xGal4-IL2-SEAP. The results are shown in Figures 10A and 10B. These
results show that the AP domain potentiates the transcriptional activation of various
combinations of transcriptional activation domains. However, the AP domain has a
stronger potentiating effect on some activation domains than on others. In particular, as
shown in Figure 10A, AP synergizes with V8, but increases only slightly transcription by
Vc. While not wishing to be bound by any particular theory, we note that one possible
explanation for this phenomenon is that both the AP domain and Vc interact with the same
GTF, i.e., TFIIA, whereas V8 interacts with a different GTF. Thus, it is possible that
synergy between different transcriptional activation domains is induced when the
activation domains interact with different GTFs. Figure 10A further shows that certain
combinations of activation domains, such as AP-V8x8-Vcx2 and AP-V8x8-AP-Vcx2, are
capable of stimulating transcription of the reporter gene to higher levels than VP16 and
even p65, which is one of the most potent naturally occurring transcription factors known.

Figure 10B shows the results of transfections with yet other fusion proteins having
various combinations of transactivating domains. In particular, Figure 10B shows that the
transcriptional activity of a transcriptional activation domain of p65, corresponding to
amino acids 361-550 (containing the AP domain) is further potentiated by the addition of
one or more other transactivation domains, such as an additional AP domain and/or V8 or
Vc domains.

In yet another set of transfections, constructs encoding fusion proteins having
activation domains fused to three copies of the ligand binding domain FKBP, and DNA
binding domains fused to ligand binding domains (FKBP or FRB in single or multiple
copies), were used. In these transfections, transcription was induced by addition of a
ligand, e.g., rapamycin, FK1012, AP1510, or other synthetic dimerizer (depending on the
ligand binding domain). The results obtained were similar to those obtained with the
covalent system.

Thus, this Example shows that potent transcriptional activators can be created by
the combination of various transcriptional activation domains and that these transcriptional
activators stimulate transcription of an integrated single copy gene.

All of the above-cited references and publications are hereby incorporated by
reference.



Equivalents 

Those skilled in the art will recognize, or be able to ascertain using no more than 
routine experimentation, numerous equivalents to the specific polypeptides, nucleic acids, 
methods, assays and reagents described herein. Such equivalents are considered to be 
within the scope of this invention. 

SEQUENCE LISTING 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2444 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: both 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE: 

(A) NAME/KEY: CDS 
(B) LOCATION: 39..1649 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 

GGCACGAGGC GGGGCCGGGT CGCAGCTGGG CCCGCGGC ATG GAC GAA CTG TTC  53
                                          Met Asp Glu Leu Phe
                                          1               5

CCC CTC ATC TTC CCG GCA GAG CAG CCC AAG CAG CGG GGC ATG CGC TTC  101
Pro Leu Ile Phe Pro Ala Glu Gln Pro Lys Gln Arg Gly Met Arg Phe
                10                  15                  20

CGC TAC AAG TGC GAG GGG CGC TCC GCG GGC AGC ATC CCA GGC GAG AGG  149
Arg Tyr Lys Cys Glu Gly Arg Ser Ala Gly Ser Ile Pro Gly Glu Arg
            25                  30                  35

AGC ACA GAT ACC ACC AAG ACC CAC CCC ACC ATC AAG ATC AAT GGC TAC  197
Ser Thr Asp Thr Thr Lys Thr His Pro Thr Ile Lys Ile Asn Gly Tyr
        40                  45                  50

ACA GGA CCA GGG ACA GTG CGC ATC TCC CTG GTC ACC AAG GAC CCT CCT  245
Thr Gly Pro Gly Thr Val Arg Ile Ser Leu Val Thr Lys Asp Pro Pro
    55                  60                  65

CAC CGG CCT CAC CCC CAC GAG CTT GTA GGA AAG GAC TGC CGG GAT GGC  293
His Arg Pro His Pro His Glu Leu Val Gly Lys Asp Cys Arg Asp Gly
70                  75                  80                  85

TTC TAT GAG GCT GAG CTC TGC CCG GAC CGC TGC ATC CAC AGT TTC CAG  341
Phe Tyr Glu Ala Glu Leu Cys Pro Asp Arg Cys Ile His Ser Phe Gln
                90                  95                  100

AAC CTG GGA ATC CAG TGT GTG AAG AAG CGG GAC CTG GAG CAG GCT ATC  389
Asn Leu Gly Ile Gln Cys Val Lys Lys Arg Asp Leu Glu Gln Ala Ile
            105                 110                 115

AGT CAG CGC ATC CAG ACC AAC AAC AAC CCC TTC CAA GTT CCT ATA GAA  437
Ser Gln Arg Ile Gln Thr Asn Asn Asn Pro Phe Gln Val Pro Ile Glu
        120                 125                 130

GAG CAG CGT GGG GAC TAC GAC CTG AAT GCT GTG CGG CTC TGC TTC CAG  485
Glu Gln Arg Gly Asp Tyr Asp Leu Asn Ala Val Arg Leu Cys Phe Gln
    135                 140                 145

GTG ACA GTG CGG GAC CCA TCA GGC AGG CCC CTC CGC CTG CCG CCT GTC  533
Val Thr Val Arg Asp Pro Ser Gly Arg Pro Leu Arg Leu Pro Pro Val
150                 155                 160                 165

CTT TCT CAT CCC ATC TTT GAC AAT CGT GCC CCC AAC ACT GCC GAG CTC  581
Leu Ser His Pro Ile Phe Asp Asn Arg Ala Pro Asn Thr Ala Glu Leu
                170                 175                 180

AAG ATC TGC CGA GTG AAC CGA AAC TCT GGC AGC TGC CTC GGT GGG GAT  629
Lys Ile Cys Arg Val Asn Arg Asn Ser Gly Ser Cys Leu Gly Gly Asp
            185                 190                 195

??? ATC TTC CTA CTG TGT GAC AAG GTG CAG AAA GAG GAC ATT GAG GTG  677
Glu Ile Phe Leu Leu Cys Asp Lys Val Gln Lys Glu Asp Ile Glu Val
        200                 205                 210

??? ??? ??? GGA CCA GGC TGG GAG GCC CGA GGC TCC TTT TCG CAA GCT  725
Tyr Phe Thr Gly Pro Gly Trp Glu Ala Arg Gly Ser Phe Ser Gln Ala
    215                 220                 225

GAT GTG CAC CGA CAA GTG GCC ATT GTG TTC CGG ACC CCT CCC TAC GCA  773
Asp Val His Arg Gln Val Ala Ile Val Phe Arg Thr Pro Pro Tyr Ala
230                 235                 240                 245

GAC CCC AGC CTG CAG GCT CCT GTG CGT GTC TCC ATG CAG CTG CGG CGG  821
Asp Pro Ser Leu Gln Ala Pro Val Arg Val Ser Met Gln Leu Arg Arg
                250                 255                 260

CCT TCC GAC CGG GAG CTC AGT GAG CCC ATG GAA TTC CAG TAC CTG CCA  869
Pro Ser Asp Arg Glu Leu Ser Glu Pro Met Glu Phe Gln Tyr Leu Pro
            265                 270                 275

GAT ACA GAC GAT CGT CAC CGG ATT GAG GAG AAA CGT AAA AGG ACA TAT  917
Asp Thr Asp Asp Arg His Arg Ile Glu Glu Lys Arg Lys Arg Thr Tyr
        280                 285                 290

GAG ACC TTC AAG AGC ATC ATG AAG AAG AGT CCT TTC AGC GGA CCC ACC  965
Glu Thr Phe Lys Ser Ile Met Lys Lys Ser Pro Phe Ser Gly Pro Thr
    295                 300                 305

GAC CCC CGG CCT CCA CCT CGA CGC ATT GCT GTG CCT TCC CGC AGC TCA  1013
Asp Pro Arg Pro Pro Pro Arg Arg Ile Ala Val Pro Ser Arg Ser Ser
310                 315                 320                 325

GCT TCT GTC CCC AAG CCA GCA CCC CAG CCC TAT CCC TTT ACG TCA TCC  1061
Ala Ser Val Pro Lys Pro Ala Pro Gln Pro Tyr Pro Phe Thr Ser Ser
                330                 335                 340

CTG AGC ACC ATC AAC TAT GAT GAG TTT CCC ACC ATG GTG TTT CCT TCT  1109
Leu Ser Thr Ile Asn Tyr Asp Glu Phe Pro Thr Met Val Phe Pro Ser
            345                 350                 355

GGG CAG ATC AGC CAG GCC TCG GCC TTG GCC CCG GCC CCT CCC CAA GTC  1157
Gly Gln Ile Ser Gln Ala Ser Ala Leu Ala Pro Ala Pro Pro Gln Val
        360                 365                 370

CTG CCC CAG GCT CCA GCC CCT GCC CCT GCT CCA GCC ATG GTA TCA GCT  1205
Leu Pro Gln Ala Pro Ala Pro Ala Pro Ala Pro Ala Met Val Ser Ala
    375                 380                 385

CTG GCC CAG GCC CCA GCC CCT GTC CCA GTC CTA GCC CCA GGC CCT CCT  1253
Leu Ala Gln Ala Pro Ala Pro Val Pro Val Leu Ala Pro Gly Pro Pro
390                 395                 400                 405

CAG GCT GTG GCC CCA CCT GCC CCC AAG CCC ACC CAG GCT GGG GAA GGA  1301
Gln Ala Val Ala Pro Pro Ala Pro Lys Pro Thr Gln Ala Gly Glu Gly
                410                 415                 420

ACG CTG TCA GAG GCC CTG CTG CAG CTG CAG TTT GAT GAT GAA GAC CTG  1349
Thr Leu Ser Glu Ala Leu Leu Gln Leu Gln Phe Asp Asp Glu Asp Leu
            425                 430                 435

GGG GCC TTG CTT GGC AAC AGC ACA GAC CCA GCT GTG TTC ACA GAC CTG  1397
Gly Ala Leu Leu Gly Asn Ser Thr Asp Pro Ala Val Phe Thr Asp Leu
        440                 445                 450

GCA TCC GTC GAC AAC TCC GAG TTT CAG CAG CTG CTG AAC CAG GGC ATA  1445
Ala Ser Val Asp Asn Ser Glu Phe Gln Gln Leu Leu Asn Gln Gly Ile
    455                 460                 465

CCT GTG GCC CCC CAC ACA ACT GAG CCC ATG CTG ATG GAG TAC CCT GAG  1493
Pro Val Ala Pro His Thr Thr Glu Pro Met Leu Met Glu Tyr Pro Glu
470                 475                 480                 485

GCT ATA ACT CGC CTA GTG ACA GCC CAG AGG CCC CCC GAC CCA GC? CCT  1541
Ala Ile Thr Arg Leu Val Thr Ala Gln Arg Pro Pro Asp Pro Ala Pro
                490                 495                 500

GCT CCA CTG GGG GCC CCG GGG CTC CCC AAT GGC CTC CTT TCA GGA GAT  1589
Ala Pro Leu Gly Ala Pro Gly Leu Pro Asn Gly Leu Leu Ser Gly Asp
            505                 510                 515

GAA GAC TTC TCC TCC ATT GCG GAC ATG GAC TTC TCA GCC CTG CTG AGT  1637
Glu Asp Phe Ser Ser Ile Ala Asp Met Asp Phe Ser Ala Leu Leu Ser
        520                 525                 530

CAG ATC AGC TCC TAAGGGGGTG ACGCCTGCCC TCCCCAGAGC ACTGGTTGCA  1689
Gln Ile Ser Ser
    535

GGGGATTGAA GCCCTCCAAA AGCACTTACG GATTCTGGTG GGGTGTGTTC CAACTGCCCC  1749
CAACTTTGTG GATGTCTTCC TTGGAGGGGG GAGCCATATT TTATTCTTTT ATTGTCAGTA  1809
TCTGTATCTC TCTCTCTTTT TGGAGGTGCT TAAGCAGAAG CATTAACTTC TCTGGAAAGG  1869
GGGGAGCTGG GGAAACTCAA ACTTTTCCCC TGTCCTGATG GTCAGCTCCC TTCTCTGTAG  1929
GGAACTGTGG GGTCCCCCAT CCCCATCCTC CAGCTTCTGG TACTCTCCTA GAGACAGAAG  1989
CAGGCTGGAG GTAAGGCCTT TGAGCCCACA AAGCCTTATC AAGTGTCTTC CATCATGGAT  2049
TCATTACAGC TTAATCAAAA TAACGCCCCA GATACCAGCC CCTGTATGGC ACTGGCATTG  2109
TCCCTGTGCC TAACACCAGC GTTTGAGGGG CTGCCTTCCT GCCCTACAGA GGTCTCTGCC  2169
GGCTCTTTCC TTGCTCAACC ATGGCTGAAG GAAACAGTGC AACAGCACTG GCTCTCTCCA  2229
GGATCCAGAA GGGGTTTGGT CTGGACTTCC TTGCTCTCCC CTCTTCTCAA GTGCCTTAAT  2289
AGTAGGGTAA GTTGTTAAGA GTGGGGGAGA GCAGGCTGGC AGCTCTCCAG TCAGGAGGCA  2349
TAGTTTTTAG TGAACAATCA AAGCACTTGG ACTCTTGCTC TTTCTACTCT GAACTAATAA  2409
AGCTGTTGCC AAGCTGGACG GCACGAGCTC GTGCC  2444

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 537 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Asp Glu Leu Phe Pro Leu Ile Phe Pro Ala Glu Gln Pro Lys Gln
1               5                   10                  15

Arg Gly Met Arg Phe Arg Tyr Lys Cys Glu Gly Arg Ser Ala Gly Ser
            20                  25                  30

Ile Pro Gly Glu Arg Ser Thr Asp Thr Thr Lys Thr His Pro Thr Ile
        35                  40                  45

Lys Ile Asn Gly Tyr Thr Gly Pro Gly Thr Val Arg Ile Ser Leu Val
    50                  55                  60

Thr Lys Asp Pro Pro His Arg Pro His Pro His Glu Leu Val Gly Lys
65                  70                  75                  80

Asp Cys Arg Asp Gly Phe Tyr Glu Ala Glu Leu Cys Pro Asp Arg Cys
                85                  90                  95

Ile His Ser Phe Gln Asn Leu Gly Ile Gln Cys Val Lys Lys Arg Asp
            100                 105                 110

Leu Glu Gln Ala Ile Ser Gln Arg Ile Gln Thr Asn Asn Asn Pro Phe
        115                 120                 125

Gln Val Pro Ile Glu Glu Gln Arg Gly Asp Tyr Asp Leu Asn Ala Val
    130                 135                 140

Arg Leu Cys Phe Gln Val Thr Val Arg Asp Pro Ser Gly Arg Pro Leu
145                 150                 155                 160

Arg Leu Pro Pro Val Leu Ser His Pro Ile Phe Asp Asn Arg Ala Pro
                165                 170                 175

Asn Thr Ala Glu Leu Lys Ile Cys Arg Val Asn Arg Asn Ser Gly Ser
            180                 185                 190

Cys Leu Gly Gly Asp Glu Ile Phe Leu Leu Cys Asp Lys Val Gln Lys
        195                 200                 205

Glu Asp Ile Glu Val Tyr Phe Thr Gly Pro Gly Trp Glu Ala Arg Gly
    210                 215                 220

Ser Phe Ser Gln Ala Asp Val His Arg Gln Val Ala Ile Val Phe Arg
225                 230                 235                 240

Thr Pro Pro Tyr Ala Asp Pro Ser Leu Gln Ala Pro Val Arg Val Ser
                245                 250                 255

Met Gln Leu Arg Arg Pro Ser Asp Arg Glu Leu Ser Glu Pro Met Glu
            260                 265                 270

Phe Gln Tyr Leu Pro Asp Thr Asp Asp Arg His Arg Ile Glu Glu Lys
        275                 280                 285

Arg Lys Arg Thr Tyr Glu Thr Phe Lys Ser Ile Met Lys Lys Ser Pro
    290                 295                 300

Phe Ser Gly Pro Thr Asp Pro Arg Pro Pro Pro Arg Arg Ile Ala Val
305                 310                 315                 320

Pro Ser Arg Ser Ser Ala Ser Val Pro Lys Pro Ala Pro Gln Pro Tyr
                325                 330                 335

Pro Phe Thr Ser Ser Leu Ser Thr Ile Asn Tyr Asp Glu Phe Pro Thr
            340                 345                 350

Met Val Phe Pro Ser Gly Gln Ile Ser Gln Ala Ser Ala Leu Ala Pro
        355                 360                 365

Ala Pro Pro Gln Val Leu Pro Gln Ala Pro Ala Pro Ala Pro Ala Pro
    370                 375                 380

Ala Met Val Ser Ala Leu Ala Gln Ala Pro Ala Pro Val Pro Val Leu
385                 390                 395                 400

Ala Pro Gly Pro Pro Gln Ala Val Ala Pro Pro Ala Pro Lys Pro Thr
                405                 410                 415

Gln Ala Gly Glu Gly Thr Leu Ser Glu Ala Leu Leu Gln Leu Gln Phe
            420                 425                 430

Asp Asp Glu Asp Leu Gly Ala Leu Leu Gly Asn Ser Thr Asp Pro Ala
        435                 440                 445

Val Phe Thr Asp Leu Ala Ser Val Asp Asn Ser Glu Phe Gln Gln Leu
    450                 455                 460

Leu Asn Gln Gly Ile Pro Val Ala Pro His Thr Thr Glu Pro Met Leu
465                 470                 475                 480

Met Glu Tyr Pro Glu Ala Ile Thr Arg Leu Val Thr Ala Gln Arg Pro
                485                 490                 495

Pro Asp Pro Ala Pro Ala Pro Leu Gly Ala Pro Gly Leu Pro Asn Gly
            500                 505                 510

Leu Leu Ser Gly Asp Glu Asp Phe Ser Ser Ile Ala Asp Met Asp Phe
        515                 520                 525

Ser Ala Leu Leu Ser Gln Ile Ser Ser
    530                 535
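
The arithmetic implied by the CDS feature of SEQ ID NO:1 (LOCATION 39..1649) and the correspondence between a codon row and its amino acid row can be checked with a short script. The following is a minimal sketch only, assuming the standard genetic code; the function names cds_span and translate are hypothetical helpers introduced for illustration and are not part of the listing.

```python
# Illustrative sketch: cds_span and translate are hypothetical helper names.
# Standard genetic code, bases ordered T, C, A, G (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def cds_span(location):
    """Return (nucleotides, codons) for a 'start..end' location (1-based, inclusive)."""
    start, end = (int(part) for part in location.split(".."))
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length, length // 3


def translate(codons):
    """Translate space-separated codons into space-separated three-letter residues."""
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons.split())


if __name__ == "__main__":
    print(cds_span("39..1649"))             # (1611, 537)
    print(translate("ATG GAC GAA CTG TTC"))  # Met Asp Glu Leu Phe
```

Applied to the location given in the FEATURE entry, the span works out to 1611 nucleotides, i.e. 537 codons, and the first five codons translate to the first five residues shown in the sequence description.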

Claims: 

1. A nucleic acid encoding a chimeric transcription activator protein which: 

(a) activates transcription of a gene to which the chimeric transcription 
activator protein is targeted; and, 

(b) contains at least one composite transcription activation domain comprising 
a continuous polypeptide region containing two or more component polypeptide 
regions, at least two of which are mutually heterologous and do not occur in nature 
in the same gene product, and at least one additional domain comprising a 
DNA-binding domain. 

2. A nucleic acid encoding a chimeric transcription activator protein which: 

(a) activates transcription of a gene to which the chimeric transcription 
activator protein is targeted; and, 

(b) contains at least one composite transcription activation domain comprising 
a continuous polypeptide region containing three or more component polypeptide 
regions, at least two of which are mutually heterologous, and at least one additional 
domain which is heterologous with respect to at least one of the component 
polypeptide regions of the composite transcription activation domain. 

3. A nucleic acid of claim 2 which encodes a chimeric transcription activator protein 
which further contains at least one heterologous domain which binds to a cell permeant 
ligand. 

4. A nucleic acid of claim 2 which encodes a chimeric transcription activator protein 
which further contains at least one heterologous domain which comprises at least one 
DNA-binding domain. 

5. A nucleic acid encoding a chimeric transcription activator protein which: 

(a) activates transcription of a gene to which the chimeric transcription 
activator protein is targeted; and, 

(b) contains at least one composite transcription activation domain comprising 
a continuous polypeptide region containing two or more component polypeptide 
regions, at least two of which are mutually heterologous, and at least one additional 
domain which is heterologous with respect to at least one of the component 
polypeptide regions of the composite transcription activation domain, wherein at 
least one of the heterologous domains is derived from an immunophilin, 
cyclophilin, calcineurin, FRAP or DNA gyrase or from a receptor for tetracycline 
or ecdysone or another steroid. 

6. A nucleic acid encoding a chimeric transcription activator protein which: 

(a) activates transcription of a gene to which the chimeric transcription 
activator protein is targeted; and, 

(b) contains at least one composite transcription activation domain comprising 
a continuous polypeptide region containing two or more component polypeptide 
regions, at least two of which are mutually heterologous, a bundling domain, and at 
least one additional domain which is heterologous with respect to at least one of 
the component polypeptide regions of the composite transcription activation 
region. 

7. A nucleic acid composition comprising a first nucleic acid encoding a chimeric 
transcription activator protein of any of claims 3 or 5 (or an LBD-dependent claim of 2 or 6) 
and a second nucleic acid encoding a chimeric DNA-binding protein comprising at least 
one ligand binding domain for a cell permeant ligand and at least one DNA-binding 
domain. 

8. A nucleic acid composition of claim 7 in which the chimeric DNA-binding protein 
comprises two or more ligand-binding domains. 

9. A nucleic acid composition of claim 7 or 8 which further comprises a target gene 
construct comprising a target gene operatively linked to an expression control sequence 
which includes a DNA sequence to which the chimeric DNA-binding protein binds. 

10. A nucleic acid composition comprising a first nucleic acid encoding a chimeric 
transcription activator protein of any of claims 1 or 4 (or a DBD-dependent claim of 2 or 
6) and a target gene construct comprising a target gene operatively linked to an expression 
control sequence which includes a DNA sequence to which the chimeric transcription 
activator protein binds. 

11. A host cell comprising a nucleic acid of any of claims 1-6. 

12. A host cell comprising a nucleic acid composition of any of claims 7-10. 

13. A host cell of claim 11 or 12 which contains a genomically-integrated target gene. 

14. A method for producing a genetically engineered host cell, which method 
comprises introducing a nucleic acid of any of claims 1-6 into a host cell under 
conditions permitting uptake by the cell of nucleic acids. 

15. A method for producing a genetically engineered host cell, which method 
comprises introducing a nucleic acid composition of any of claims 7-10 into a host cell 
under conditions permitting uptake by the cell of nucleic acids. 

16. A method for effecting expression of a target gene which comprises maintaining 
genetically engineered host cells of claims 11, 12 or 13 under conditions suitable for gene 
expression. 

17. A method of claim 16 in which the target gene expression is regulated, and the 
conditions for gene expression include the presence of a ligand which binds to one or more 
of the chimeric proteins. 